<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>2657 Productions News &#187; Geekiness</title>
	<atom:link href="http://news.mrdwab.com/category/geekiness/feed/" rel="self" type="application/rss+xml" />
	<link>http://news.mrdwab.com</link>
	<description>..:: Whereabouts and Whatabouts of the 2657 World ::..</description>
	<lastBuildDate>Sun, 08 Aug 2010 15:23:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Using the reshape package in R for pivot-table-like functionality</title>
		<link>http://news.mrdwab.com/2010-08-08/using-the-reshape-packagein-r/</link>
		<comments>http://news.mrdwab.com/2010-08-08/using-the-reshape-packagein-r/#comments</comments>
		<pubDate>Sun, 08 Aug 2010 14:56:49 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[contingency tables]]></category>
		<category><![CDATA[cross tabulation]]></category>
		<category><![CDATA[data manipulation]]></category>
		<category><![CDATA[experiments]]></category>
		<category><![CDATA[ftable]]></category>
		<category><![CDATA[pivot tables]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[reshape]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[table]]></category>
		<category><![CDATA[xtabs]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=820</guid>
		<description><![CDATA[A little more than a week ago, I wrote about creating pivot tables in Microsoft Excel and OpenOffice.org. I also mentioned that I would explain how to do similar calculations by using R. This post will explain how to achieve similar results in R by using the reshape package. I had initially started experimenting with [...]
]]></description>
			<content:encoded><![CDATA[<p>A little more than a week ago, <a href="http://news.mrdwab.com/2010-07-30/pivot-tables/" title="Pivot Tables in Excel and OpenOffice.org Calc">I wrote about creating pivot tables in Microsoft Excel and OpenOffice.org</a>. I also mentioned that I would explain how to do similar calculations by using R. This post will explain how to achieve similar results in R by using the <a href="http://had.co.nz/reshape/" target="_blank">reshape</a> package.</p>
<p>I first started experimenting with the reshape package several months ago, when I was trying to figure out <a href="http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/" title="Quickly reshaping data from &quot;wide&quot; to &quot;long&quot; formats in R">how to reshape data from wide to long formats</a>. Once I started using it, however, I realized I had misunderstood what the package was designed to do. Now that I finally have a grasp of what it can do, I thought I would share what I&#8217;ve found through a few examples.</p>
<p><span id="more-820"></span></p>
<h2>Part 1: Contingency tables using built-in functions</h2>
<p>There are a lot of different ways to tabulate data in R. In this post, I&#8217;ll start by demonstrating how to use <code>table</code>, <code>xtabs</code>, and <code>ftable</code> before making things more interesting and using the reshape package to flip data around.</p>
<p>We&#8217;ll start with R&#8217;s built-in functions. First, we&#8217;ll get some data into R. We&#8217;ll start with <a href="http://en.wikipedia.org/wiki/Cross_tabulation#Example" target="_blank">a basic table from Wikipedia</a> that was used to demonstrate cross-tabulation. I&#8217;ve already gone and put the data in a CSV file that can be loaded using the code below.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> handedness <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.csv</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;http://news.mrdwab.com/handedness&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">table</span><span style="color:#ff0080; font-weight:bold">(</span>handedness<span style="color:#ff0080; font-weight:bold">)</span>
        Handedness
Gender   left<span style="color:#ff0080; font-weight:bold">-</span>handed right<span style="color:#ff0080; font-weight:bold">-</span>handed
  Female           <span style="color:#800080; font-weight:bold">1            5</span>
  Male             <span style="color:#800080; font-weight:bold">2            4</span></pre>
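<p>If the CSV file is not reachable, the same kind of table can be reproduced from a hand-built data frame. The following is only a sketch: the object name <code>toy</code> is hypothetical, and its rows are an assumption chosen to match the counts printed above rather than a copy of the original file.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># Hypothetical stand-in for the CSV file: 12 respondents, one row each.
# The rows are assumed, chosen to reproduce the counts shown above.
toy &lt;- data.frame(
  Gender     = rep(c(&quot;Female&quot;, &quot;Male&quot;), each = 6),
  Handedness = c(rep(c(&quot;left-handed&quot;, &quot;right-handed&quot;), times = c(1, 5)),
                 rep(c(&quot;left-handed&quot;, &quot;right-handed&quot;), times = c(2, 4)))
)

# Passing a two-column data frame to table() cross-tabulates the columns.
counts &lt;- table(toy)
print(counts)</pre>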
<p>Did you see how easy it was to tabulate frequencies of the data? Unfortunately, not all data is that easy to work with. Now, we&#8217;ll add a couple of columns to the data.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> fav.col <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;red&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;green&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;blue&quot;</span><span style="color:#ff0080; font-weight:bold">),</span> <span style="color:#800080; font-weight:bold">12</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">replace</span> <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#8080c0; font-weight:bold">TRUE</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> fav.shape <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;square&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;triangle&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;circle&quot;</span><span style="color:#ff0080; font-weight:bold">),</span> <span style="color:#800080; font-weight:bold">12</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">replace</span> <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#8080c0; font-weight:bold">TRUE</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> handedness.plus <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">cbind</span><span style="color:#ff0080; font-weight:bold">(</span>handedness<span style="color:#ff0080; font-weight:bold">,</span>fav.col<span style="color:#ff0080; font-weight:bold">,</span>fav.shape<span style="color:#ff0080; font-weight:bold">)</span></pre>
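<p>One side effect is worth noticing. Because the seed is reset to the same value before each call, and both calls draw 12 values from a three-element vector, the two samples pick the same positions every time. That is why, in the <code>fav.shape</code> tables further down, each color always shows up with the same shape. The sketch below checks this with base R only; the name <code>pairing</code> is made up for the illustration, and the actual draws may differ from the ones in this post, since newer versions of R changed the default sampling method.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># Same seed, same number of choices, same number of draws
set.seed(123)
fav.col &lt;- sample(c(&quot;red&quot;, &quot;green&quot;, &quot;blue&quot;), 12, replace = TRUE)
set.seed(123)
fav.shape &lt;- sample(c(&quot;square&quot;, &quot;triangle&quot;, &quot;circle&quot;), 12, replace = TRUE)

# Cross-tabulate the two vectors: every color lines up with one shape only
pairing &lt;- table(fav.col, fav.shape)
print(pairing)

# TRUE when each color (row) has exactly one non-empty cell
all(rowSums(pairing &gt; 0) == 1)</pre>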
<p>We can use the table function again to see what our data look like. In the following line, we are using favorite color (<code>fav.col</code>) as our rows, <code>Gender</code> as our columns, and separate tables for left-handed and right-handed respondents.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">table</span><span style="color:#ff0080; font-weight:bold">(</span>handedness.plus$fav.col<span style="color:#ff0080; font-weight:bold">,</span> handedness.plus$Gender<span style="color:#ff0080; font-weight:bold">,</span>
       handedness.plus$Handedness<span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">, ,  =</span> left<span style="color:#ff0080; font-weight:bold">-</span>handed

        Female Male
  blue       <span style="color:#800080; font-weight:bold">0    1</span>
  green      <span style="color:#800080; font-weight:bold">1    1</span>
  red        <span style="color:#800080; font-weight:bold">0    0</span>

<span style="color:#ff0080; font-weight:bold">, ,  =</span> right<span style="color:#ff0080; font-weight:bold">-</span>handed

        Female Male
  blue       <span style="color:#800080; font-weight:bold">2    2</span>
  green      <span style="color:#800080; font-weight:bold">2    1</span>
  red        <span style="color:#800080; font-weight:bold">1    1</span></pre>
<p>This is OK, but the code is somewhat cumbersome. The <code>xtabs</code> function gives the same counts with simpler code. Notice that we don&#8217;t need the &#8220;<code>$</code>&#8221; notation in the following code, because the second argument tells <code>xtabs</code> which data frame to look in for the variables.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">xtabs</span><span style="color:#ff0080; font-weight:bold">(~</span> fav.col <span style="color:#ff0080; font-weight:bold">+</span> Gender <span style="color:#ff0080; font-weight:bold">+</span> Handedness<span style="color:#ff0080; font-weight:bold">,</span> handedness.plus<span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">, ,</span> Handedness <span style="color:#ff0080; font-weight:bold">=</span> left<span style="color:#ff0080; font-weight:bold">-</span>handed

       Gender
fav.col Female Male
  blue       <span style="color:#800080; font-weight:bold">0    1</span>
  green      <span style="color:#800080; font-weight:bold">1    1</span>
  red        <span style="color:#800080; font-weight:bold">0    0</span>

<span style="color:#ff0080; font-weight:bold">, ,</span> Handedness <span style="color:#ff0080; font-weight:bold">=</span> right<span style="color:#ff0080; font-weight:bold">-</span>handed

       Gender
fav.col Female Male
  blue       <span style="color:#800080; font-weight:bold">2    2</span>
  green      <span style="color:#800080; font-weight:bold">2    1</span>
  red        <span style="color:#800080; font-weight:bold">1    1</span></pre>
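<p>To see that the two approaches really agree, here is a small self-contained check. The data frame <code>toy</code> and its six rows are invented for the illustration (they are not the survey data used above); only <code>table</code> and <code>xtabs</code> from base R are used.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># Hypothetical six-row data set, for illustration only
toy &lt;- data.frame(
  Gender     = c(&quot;Female&quot;, &quot;Female&quot;, &quot;Male&quot;, &quot;Male&quot;, &quot;Male&quot;, &quot;Female&quot;),
  Handedness = c(&quot;left-handed&quot;, &quot;right-handed&quot;, &quot;right-handed&quot;,
                 &quot;left-handed&quot;, &quot;right-handed&quot;, &quot;right-handed&quot;),
  fav.col    = c(&quot;red&quot;, &quot;blue&quot;, &quot;blue&quot;, &quot;green&quot;, &quot;red&quot;, &quot;blue&quot;)
)

by.table &lt;- table(toy$fav.col, toy$Gender, toy$Handedness)
by.xtabs &lt;- xtabs(~ fav.col + Gender + Handedness, data = toy)

# The counts are identical, cell for cell
all(as.vector(by.table) == as.vector(by.xtabs))

# xtabs also keeps the variable names as labels for the dimensions
names(dimnames(by.xtabs))</pre>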
<p>The <code>ftable</code> (flat table) function is even more flexible. Instead of having two separate tables, for instance, you can have a single table with the same information. With the <code>ftable</code> function, you name the data, and then identify which variables you want to use for the rows and which for the columns. In our data, we have four columns named &#8220;<code>Gender</code>&#8221;, &#8220;<code>Handedness</code>&#8221;, &#8220;<code>fav.col</code>&#8221;, and &#8220;<code>fav.shape</code>&#8221;. If we want a flat table version of the previous output, we use the third column for our rows, and the first and second columns determine our columns. Here are three examples and their output.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">ftable</span><span style="color:#ff0080; font-weight:bold">(</span>handedness.plus<span style="color:#ff0080; font-weight:bold">,</span> row.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">,</span> col.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">))</span>
        Gender          Female                     Male
        Handedness left<span style="color:#ff0080; font-weight:bold">-</span>handed right<span style="color:#ff0080; font-weight:bold">-</span>handed left<span style="color:#ff0080; font-weight:bold">-</span>handed right<span style="color:#ff0080; font-weight:bold">-</span>handed
fav.col
blue                         <span style="color:#800080; font-weight:bold">0            2           1            2</span>
green                        <span style="color:#800080; font-weight:bold">1            2           1            1</span>
red                          <span style="color:#800080; font-weight:bold">0            1           0            1</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">ftable</span><span style="color:#ff0080; font-weight:bold">(</span>handedness.plus<span style="color:#ff0080; font-weight:bold">,</span> row.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">,</span> col.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">4</span><span style="color:#ff0080; font-weight:bold">))</span>
       Handedness left<span style="color:#ff0080; font-weight:bold">-</span>handed                 right<span style="color:#ff0080; font-weight:bold">-</span>handed
       fav.shape       circle square triangle       circle square triangle
Gender
Female                      <span style="color:#800080; font-weight:bold">0      0        1            2      1        2</span>
Male                        <span style="color:#800080; font-weight:bold">1      0        1            2      1        1</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">ftable</span><span style="color:#ff0080; font-weight:bold">(</span>handedness.plus<span style="color:#ff0080; font-weight:bold">,</span> row.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">,</span> col.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">4</span><span style="color:#ff0080; font-weight:bold">)</span>
                            fav.shape circle square triangle
Gender Handedness   fav.col
Female left<span style="color:#ff0080; font-weight:bold">-</span>handed  blue                   <span style="color:#800080; font-weight:bold">0      0        0</span>
                    green                  <span style="color:#800080; font-weight:bold">0      0        1</span>
                    red                    <span style="color:#800080; font-weight:bold">0      0        0</span>
       right<span style="color:#ff0080; font-weight:bold">-</span>handed blue                   <span style="color:#800080; font-weight:bold">2      0        0</span>
                    green                  <span style="color:#800080; font-weight:bold">0      0        2</span>
                    red                    <span style="color:#800080; font-weight:bold">0      1        0</span>
Male   left<span style="color:#ff0080; font-weight:bold">-</span>handed  blue                   <span style="color:#800080; font-weight:bold">1      0        0</span>
                    green                  <span style="color:#800080; font-weight:bold">0      0        1</span>
                    red                    <span style="color:#800080; font-weight:bold">0      0        0</span>
       right<span style="color:#ff0080; font-weight:bold">-</span>handed blue                   <span style="color:#800080; font-weight:bold">2      0        0</span>
                    green                  <span style="color:#800080; font-weight:bold">0      0        1</span>
                    red                    <span style="color:#800080; font-weight:bold">0      1        0</span></pre>
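<p>Column numbers are easy to mix up, so it may help to know that <code>ftable</code> also accepts variable names, and that it has a formula interface in which the column variables go on the left of the &#8220;<code>~</code>&#8221; and the row variables on the right. The following sketch uses an invented six-row data frame called <code>toy</code> (an assumption for the example, not the survey data) to show that both spellings produce the same flat table.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># Hypothetical six-row data set, for illustration only
toy &lt;- data.frame(
  Gender     = c(&quot;Female&quot;, &quot;Female&quot;, &quot;Male&quot;, &quot;Male&quot;, &quot;Male&quot;, &quot;Female&quot;),
  Handedness = c(&quot;left-handed&quot;, &quot;right-handed&quot;, &quot;right-handed&quot;,
                 &quot;left-handed&quot;, &quot;right-handed&quot;, &quot;right-handed&quot;),
  fav.col    = c(&quot;red&quot;, &quot;blue&quot;, &quot;blue&quot;, &quot;green&quot;, &quot;red&quot;, &quot;blue&quot;)
)

# Rows and columns chosen by name instead of by position
flat.by.name &lt;- ftable(toy, row.vars = &quot;fav.col&quot;,
                       col.vars = c(&quot;Gender&quot;, &quot;Handedness&quot;))

# Formula interface: column variables ~ row variables
flat.by.formula &lt;- ftable(Gender + Handedness ~ fav.col, data = toy)

print(flat.by.name)

# Both calls lay the counts out the same way
all(as.vector(flat.by.name) == as.vector(flat.by.formula))</pre>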
<h2>Part 2: Advanced results using the reshape package</h2>
<p>These flat frequency tables are informative, but they may not provide all the information you actually want. For instance, what if you wanted more pivot-table-like results, where you were interested not just in frequencies, but also in sums and averages? If that&#8217;s the case, then you really need to figure out how to use the reshape package. Here are some examples.</p>
<p>First, we&#8217;ll load the package, load some data, and preview the rows in our data. (We&#8217;ll use the same data from the <a href="http://news.mrdwab.com/2010-07-30/pivot-tables/" title="Pivot Tables in Excel and OpenOffice.org Calc">Pivot Tables in Excel and OpenOffice.org Calc</a> post.)</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">library</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">reshape</span><span style="color:#ff0080; font-weight:bold">)</span>
Loading required package<span style="color:#ff0080; font-weight:bold">:</span> plyr
<span style="color:#ff0080; font-weight:bold">&gt;</span> book.sales <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.csv</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;http://news.mrdwab.com/booksales&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">names</span><span style="color:#ff0080; font-weight:bold">(</span>book.sales<span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;Representative&quot;</span> <span style="color:#a68500">&quot;Region&quot;</span>         <span style="color:#a68500">&quot;Month&quot;</span>          <span style="color:#a68500">&quot;Publisher&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;Subject&quot;</span>        <span style="color:#a68500">&quot;Sales&quot;</span>          <span style="color:#a68500">&quot;Margin&quot;</span>         <span style="color:#a68500">&quot;Quantity&quot;</span></pre>
<p>In order to use the reshape package, you need to &#8220;<code>melt</code>&#8221; the data. The melting function asks for id variables (<code>id.vars</code>) and measured variables (<code>measure.vars</code>); if you supply only the id variables, every remaining column is treated as measured. In the case of the book sales dataset, the measured variables are &#8220;<code>Sales</code>&#8221;, &#8220;<code>Margin</code>&#8221;, and &#8220;<code>Quantity</code>&#8221; (columns 6 through 8), so columns 1 through 5 are the id variables. Using that information, we melt the data using the following code.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> m.book.sales <span style="color:#ff0080; font-weight:bold">=</span> melt<span style="color:#ff0080; font-weight:bold">(</span>book.sales<span style="color:#ff0080; font-weight:bold">,</span> id.vars <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">)</span></pre>
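<p>If it is not obvious what melting does, a tiny example may help. The data frame <code>toy.sales</code> below is invented for the illustration, and <code>stringsAsFactors = TRUE</code> is set only to mimic how R read text columns when this post was written. Each measured column is stacked into a pair of columns, <code>variable</code> and <code>value</code>, so three rows with two measured columns become six rows.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">library(reshape)

# Hypothetical figures, for illustration only
toy.sales &lt;- data.frame(
  Region   = c(&quot;N&quot;, &quot;N&quot;, &quot;S&quot;),
  Sales    = c(100, 50, 80),
  Quantity = c(10, 5, 8),
  stringsAsFactors = TRUE
)

# Region is the id variable; Sales and Quantity are treated as measured
m.toy &lt;- melt(toy.sales, id.vars = &quot;Region&quot;)
print(m.toy)

names(m.toy)   # id column, then &quot;variable&quot; and &quot;value&quot;
nrow(m.toy)    # 3 rows x 2 measured columns</pre>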
<p>Now, it&#8217;s time to play around and see what we can do with reshape. The counterpart to &#8220;<code>melt</code>&#8221; is &#8220;<code>cast</code>&#8221;. We first name the dataset we are working with (<code>m.book.sales</code>) and then give a formula describing the shape we want. In the formula, the variables to the left of the &#8220;<code>~</code>&#8221; become the rows of the output and the variables to the right become the columns. The &#8220;<code>~</code>&#8221; is usually read as &#8220;is described by&#8221;, but it&#8217;s best understood with some experimentation.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">)</span>
Aggregation requires fun.aggregate<span style="color:#ff0080; font-weight:bold">:</span> <span style="color:#0080c0">length</span> used <span style="color:#0080c0">as</span> default
  Region Sales Margin Quantity
<span style="color:#800080; font-weight:bold">1</span>      E    <span style="color:#800080; font-weight:bold">15     15       15</span>
<span style="color:#800080; font-weight:bold">2</span>      N   <span style="color:#800080; font-weight:bold">122    122      122</span>
<span style="color:#800080; font-weight:bold">3</span>      S    <span style="color:#800080; font-weight:bold">91     91       91</span>
<span style="color:#800080; font-weight:bold">4</span>      W    <span style="color:#800080; font-weight:bold">77     77       77</span></pre>
<p>Well, that&#8217;s not really useful. But the notice provided, &#8220;<code>Aggregation requires fun.aggregate: length used as default</code>&#8221;, is helpful. It means that each cell combines several rows, so we need to tell <code>cast</code> which function to use to aggregate them, for instance, &#8220;<code>sum</code>&#8221; or &#8220;<code>mean</code>&#8221;. Without one, it simply counts the rows. Let&#8217;s try this again.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">)</span>
  Region   Sales  Margin Quantity
<span style="color:#800080; font-weight:bold">1</span>      E  <span style="color:#800080; font-weight:bold">2267.0 1081.84      135</span>
<span style="color:#800080; font-weight:bold">2</span>      N <span style="color:#800080; font-weight:bold">18328.3 8552.22     1130</span>
<span style="color:#800080; font-weight:bold">3</span>      S <span style="color:#800080; font-weight:bold">13276.5 6230.62      813</span>
<span style="color:#800080; font-weight:bold">4</span>      W <span style="color:#800080; font-weight:bold">12078.8 5758.08      690</span></pre>
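<p>As an aside, per-group totals like these can be cross-checked without reshape by using <code>aggregate</code> from base R on the original, unmelted data. The sketch below uses an invented data frame, <code>toy.sales</code>, rather than the book sales file, so the numbers are purely illustrative.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># Hypothetical figures, for illustration only
toy.sales &lt;- data.frame(
  Region   = c(&quot;N&quot;, &quot;N&quot;, &quot;S&quot;),
  Sales    = c(100, 50, 80),
  Quantity = c(10, 5, 8)
)

# Sum every measured column within each region
totals &lt;- aggregate(cbind(Sales, Quantity) ~ Region, data = toy.sales, FUN = sum)
print(totals)</pre>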
<p>The sums are much more useful. But can we make more complicated tables? Here&#8217;s one broken down first by region, then by representative.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">+</span> Representative <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">)</span>
   Region Representative   Sales  Margin Quantity
<span style="color:#800080; font-weight:bold">1</span>       E           Kunj <span style="color:#800080; font-weight:bold">1309.00  606.37       82</span>
<span style="color:#800080; font-weight:bold">2</span>       E         Rajesh  <span style="color:#800080; font-weight:bold">958.00  475.47       53</span>
<span style="color:#800080; font-weight:bold">3</span>       N        Gajanan <span style="color:#800080; font-weight:bold">5348.95 2551.27      307</span>
<span style="color:#800080; font-weight:bold">4</span>       N        Mallesh <span style="color:#800080; font-weight:bold">6931.70 3273.15      445</span>
<span style="color:#800080; font-weight:bold">5</span>       N          Priya <span style="color:#800080; font-weight:bold">4790.65 2168.95      309</span>
<span style="color:#800080; font-weight:bold">6</span>       N           Ravi <span style="color:#800080; font-weight:bold">1257.00  558.85       69</span>
<span style="color:#800080; font-weight:bold">7</span>       S        Mahanta <span style="color:#800080; font-weight:bold">3780.70 1698.01      239</span>
<span style="color:#800080; font-weight:bold">8</span>       S            Raj <span style="color:#800080; font-weight:bold">3463.80 1690.53      183</span>
<span style="color:#800080; font-weight:bold">9</span>       S           Soni <span style="color:#800080; font-weight:bold">6032.00 2842.08      391</span>
<span style="color:#800080; font-weight:bold">10</span>      W     Shreekanth <span style="color:#800080; font-weight:bold">4065.20 1930.84      222</span>
<span style="color:#800080; font-weight:bold">11</span>      W      Shreerang <span style="color:#800080; font-weight:bold">3570.20 1675.55      213</span>
<span style="color:#800080; font-weight:bold">12</span>      W      Sree Hari <span style="color:#800080; font-weight:bold">4443.40 2151.69      255</span></pre>
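<p>Swapping the aggregation function is all it takes to get averages instead of totals. Here is a minimal sketch with <code>mean</code>, again on an invented data frame (<code>toy.sales</code>, with made-up figures and <code>stringsAsFactors = TRUE</code> to mimic the R defaults of the time) so that it can be run without the book sales file.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">library(reshape)

# Hypothetical figures, for illustration only
toy.sales &lt;- data.frame(
  Region         = c(&quot;N&quot;, &quot;N&quot;, &quot;N&quot;, &quot;S&quot;, &quot;S&quot;),
  Representative = c(&quot;Priya&quot;, &quot;Priya&quot;, &quot;Ravi&quot;, &quot;Soni&quot;, &quot;Soni&quot;),
  Sales          = c(100, 50, 30, 80, 20),
  Quantity       = c(10, 5, 3, 8, 2),
  stringsAsFactors = TRUE
)

m.toy &lt;- melt(toy.sales, id.vars = c(&quot;Region&quot;, &quot;Representative&quot;))

# One row per region and representative, averaging each measured column
avg &lt;- cast(m.toy, Region + Representative ~ variable, mean)
print(avg)</pre>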
<p>Notice that the output of the &#8220;<code>Region + Representative</code>&#8221; call is a convenient flat table. It works well because each representative sells in only one region. But what if we had used &#8220;<code>Region + Publisher</code>&#8221; instead?</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">+</span> Publisher <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">)</span>
   Region   Publisher  Sales  Margin Quantity
<span style="color:#800080; font-weight:bold">1</span>       E  Bloomsbury  <span style="color:#800080; font-weight:bold">261.0  137.03       18</span>
<span style="color:#800080; font-weight:bold">2</span>       E McGraw<span style="color:#ff0080; font-weight:bold">-</span>Hill <span style="color:#800080; font-weight:bold">1116.0  507.45       58</span>
<span style="color:#800080; font-weight:bold">3</span>       E   Routledge  <span style="color:#800080; font-weight:bold">120.0   64.80        6</span>
<span style="color:#800080; font-weight:bold">4</span>       E        SAGE  <span style="color:#800080; font-weight:bold">610.0  289.36       43</span>
<span style="color:#800080; font-weight:bold">5</span>       E        Viva  <span style="color:#800080; font-weight:bold">160.0   83.20       10</span>
<span style="color:#800080; font-weight:bold">6</span>       N  Bloomsbury <span style="color:#800080; font-weight:bold">2812.5 1267.79      237</span>
<span style="color:#800080; font-weight:bold">7</span>       N McGraw<span style="color:#ff0080; font-weight:bold">-</span>Hill <span style="color:#800080; font-weight:bold">3901.8 1783.68      193</span>
<span style="color:#800080; font-weight:bold">8</span>       N     Penguin  <span style="color:#800080; font-weight:bold">864.0  403.56       54</span>
<span style="color:#800080; font-weight:bold">9</span>       N   Routledge <span style="color:#800080; font-weight:bold">3741.0 1690.60      174</span>
<span style="color:#800080; font-weight:bold">10</span>      N        SAGE <span style="color:#800080; font-weight:bold">5484.8 2629.49      380</span>
<span style="color:#800080; font-weight:bold">11</span>      N        Viva <span style="color:#800080; font-weight:bold">1524.2  777.10       92</span>
<span style="color:#800080; font-weight:bold">12</span>      S  Bloomsbury <span style="color:#800080; font-weight:bold">1335.0  595.01      114</span>
<span style="color:#800080; font-weight:bold">13</span>      S McGraw<span style="color:#ff0080; font-weight:bold">-</span>Hill <span style="color:#800080; font-weight:bold">2455.8 1123.62      125</span>
<span style="color:#800080; font-weight:bold">14</span>      S     Penguin  <span style="color:#800080; font-weight:bold">972.0  429.30       67</span>
<span style="color:#800080; font-weight:bold">15</span>      S   Routledge <span style="color:#800080; font-weight:bold">2688.0 1292.60      123</span>
<span style="color:#800080; font-weight:bold">16</span>      S        SAGE <span style="color:#800080; font-weight:bold">4640.6 2206.77      320</span>
<span style="color:#800080; font-weight:bold">17</span>      S        Viva <span style="color:#800080; font-weight:bold">1185.1  583.32       64</span>
<span style="color:#800080; font-weight:bold">18</span>      W  Bloomsbury  <span style="color:#800080; font-weight:bold">787.5  371.21       63</span>
<span style="color:#800080; font-weight:bold">19</span>      W McGraw<span style="color:#ff0080; font-weight:bold">-</span>Hill <span style="color:#800080; font-weight:bold">4299.0 1944.25      203</span>
<span style="color:#800080; font-weight:bold">20</span>      W     Penguin  <span style="color:#800080; font-weight:bold">414.0  188.82       29</span>
<span style="color:#800080; font-weight:bold">21</span>      W   Routledge <span style="color:#800080; font-weight:bold">1834.0  904.78       85</span>
<span style="color:#800080; font-weight:bold">22</span>      W        SAGE <span style="color:#800080; font-weight:bold">3540.3 1733.46      246</span>
<span style="color:#800080; font-weight:bold">23</span>      W        Viva <span style="color:#800080; font-weight:bold">1204.0  615.56       64</span></pre>
<p>Notice that things are no longer so tidy: books from a given publisher are sold in multiple regions, so each publisher shows up several times. By sticking another &#8220;<code>~</code>&#8221; into the formula after &#8220;<code>variable</code>&#8221;, followed by <code>Publisher</code>, we get a separate table for each publisher. Compare the following output with the previous one.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">~</span> variable <span style="color:#ff0080; font-weight:bold">~</span> Publisher<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">, ,</span> Publisher <span style="color:#ff0080; font-weight:bold">=</span> Bloomsbury

      variable
Region  Sales  Margin Quantity
     E  <span style="color:#800080; font-weight:bold">261.0  137.03       18</span>
     N <span style="color:#800080; font-weight:bold">2812.5 1267.79      237</span>
     S <span style="color:#800080; font-weight:bold">1335.0  595.01      114</span>
     W  <span style="color:#800080; font-weight:bold">787.5  371.21       63</span>

<span style="color:#ff0080; font-weight:bold">, ,</span> Publisher <span style="color:#ff0080; font-weight:bold">=</span> McGraw<span style="color:#ff0080; font-weight:bold">-</span>Hill

      variable
Region  Sales  Margin Quantity
     E <span style="color:#800080; font-weight:bold">1116.0  507.45       58</span>
     N <span style="color:#800080; font-weight:bold">3901.8 1783.68      193</span>
     S <span style="color:#800080; font-weight:bold">2455.8 1123.62      125</span>
     W <span style="color:#800080; font-weight:bold">4299.0 1944.25      203</span>

<span style="color:#ff0080; font-weight:bold">, ,</span> Publisher <span style="color:#ff0080; font-weight:bold">=</span> Penguin

      variable
Region Sales Margin Quantity
     E     <span style="color:#800080; font-weight:bold">0   0.00        0</span>
     N   <span style="color:#800080; font-weight:bold">864 403.56       54</span>
     S   <span style="color:#800080; font-weight:bold">972 429.30       67</span>
     W   <span style="color:#800080; font-weight:bold">414 188.82       29</span>

<span style="color:#ff0080; font-weight:bold">, ,</span> Publisher <span style="color:#ff0080; font-weight:bold">=</span> Routledge

      variable
Region Sales  Margin Quantity
     E   <span style="color:#800080; font-weight:bold">120   64.80        6</span>
     N  <span style="color:#800080; font-weight:bold">3741 1690.60      174</span>
     S  <span style="color:#800080; font-weight:bold">2688 1292.60      123</span>
     W  <span style="color:#800080; font-weight:bold">1834  904.78       85</span>

<span style="color:#ff0080; font-weight:bold">, ,</span> Publisher <span style="color:#ff0080; font-weight:bold">=</span> SAGE

      variable
Region  Sales  Margin Quantity
     E  <span style="color:#800080; font-weight:bold">610.0  289.36       43</span>
     N <span style="color:#800080; font-weight:bold">5484.8 2629.49      380</span>
     S <span style="color:#800080; font-weight:bold">4640.6 2206.77      320</span>
     W <span style="color:#800080; font-weight:bold">3540.3 1733.46      246</span>

<span style="color:#ff0080; font-weight:bold">, ,</span> Publisher <span style="color:#ff0080; font-weight:bold">=</span> Viva

      variable
Region  Sales Margin Quantity
     E  <span style="color:#800080; font-weight:bold">160.0  83.20       10</span>
     N <span style="color:#800080; font-weight:bold">1524.2 777.10       92</span>
     S <span style="color:#800080; font-weight:bold">1185.1 583.32       64</span>
     W <span style="color:#800080; font-weight:bold">1204.0 615.56       64</span></pre>
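<p>Because the formula has two tildes, the result is no longer a flat table but a three-dimensional array (Region by variable by Publisher), which is why R prints one slice per publisher. As a minimal sketch of how you might pull a single slice back out, here is the same idea on a tiny made-up dataset. The object names <code>toy</code>, <code>m.toy</code>, and <code>by.pub</code>, and all of the numbers, are invented for illustration and are not part of the book sales data.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">library(reshape)

# Toy stand-in for the book sales data (values are made up)
toy = data.frame(
  Region    = c("N", "N", "S", "S"),
  Publisher = c("SAGE", "Viva", "SAGE", "Viva"),
  Sales     = c(100, 50, 80, 40),
  Quantity  = c(10, 5, 8, 4),
  stringsAsFactors = TRUE
)

# Melt so that Sales and Quantity end up in the "variable" column
m.toy = melt(toy, id.vars = c("Region", "Publisher"))

# Two tildes: Region by variable by Publisher
by.pub = cast(m.toy, Region ~ variable ~ Publisher, sum)

dim(by.pub)                    # one entry per dimension
by.pub[, , "SAGE"]             # just the SAGE table
by.pub["N", "Sales", "Viva"]   # a single cell</pre>
<p>Indexing by name like this should also work on the real result, for example with <code>"Penguin"</code> in the third position, assuming you have stored the output of <code>cast</code> in an object first.</p>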
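<p>What if we wanted totals? To get them, we add the &#8220;<code>margins</code>&#8221; argument to our call. Keep in mind that the &#8220;(all)&#8221; column simply adds Sales, Margin, and Quantity together (for region E, 2267.0 + 1081.84 + 135 = 3483.84), so with this data only the &#8220;(all)&#8221; row is really meaningful.</p>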
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">,</span> margins<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;grand_col&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;grand_row&quot;</span><span style="color:#ff0080; font-weight:bold">))</span>
  Region   Sales   Margin Quantity    <span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">all</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#800080; font-weight:bold">1</span>      E  <span style="color:#800080; font-weight:bold">2267.0  1081.84      135  3483.84</span>
<span style="color:#800080; font-weight:bold">2</span>      N <span style="color:#800080; font-weight:bold">18328.3  8552.22     1130 28010.52</span>
<span style="color:#800080; font-weight:bold">3</span>      S <span style="color:#800080; font-weight:bold">13276.5  6230.62      813 20320.12</span>
<span style="color:#800080; font-weight:bold">4</span>      W <span style="color:#800080; font-weight:bold">12078.8  5758.08      690 18526.88</span>
<span style="color:#800080; font-weight:bold">5</span>  <span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">all</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#800080; font-weight:bold">45950.6 21622.76     2768 70341.36</span></pre>
<p>And what if we wanted both the sum and the mean? We can pass several functions at once, as follows.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">mean</span><span style="color:#ff0080; font-weight:bold">))</span>
  Region Sales_sum Sales_mean Margin_sum Margin_mean Quantity_sum Quantity_mean
<span style="color:#800080; font-weight:bold">1</span>      E    <span style="color:#800080; font-weight:bold">2267.0   151.1333    1081.84    72.12267          135      9.000000</span>
<span style="color:#800080; font-weight:bold">2</span>      N   <span style="color:#800080; font-weight:bold">18328.3   150.2320    8552.22    70.10016         1130      9.262295</span>
<span style="color:#800080; font-weight:bold">3</span>      S   <span style="color:#800080; font-weight:bold">13276.5   145.8956    6230.62    68.46835          813      8.934066</span>
<span style="color:#800080; font-weight:bold">4</span>      W   <span style="color:#800080; font-weight:bold">12078.8   156.8675    5758.08    74.78026          690      8.961039</span></pre>
<p>Of course, we can also subset our data so that we get information on selected variables only. You may remember from the original book sales dataset that the &#8220;<code>variables</code>&#8221; are &#8220;<code>Sales</code>&#8221;, &#8220;<code>Margin</code>&#8221;, and &#8220;<code>Quantity</code>&#8221;. We can use that information to cast different summary tables. Here are two more examples.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">+</span> Representative <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span>
      <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">length</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">mean</span><span style="color:#ff0080; font-weight:bold">),</span> <span style="color:#0080c0">subset</span> <span style="color:#ff0080; font-weight:bold">=</span> variable <span style="color:#ff0080; font-weight:bold">%</span><span style="color:#bb7977; font-weight:bold">in</span><span style="color:#ff0080; font-weight:bold">%</span> <span style="color:#a68500">&quot;Sales&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
   Region Representative Sales_length Sales_sum Sales_mean
<span style="color:#800080; font-weight:bold">1</span>       E           Kunj            <span style="color:#800080; font-weight:bold">8   1309.00   163.6250</span>
<span style="color:#800080; font-weight:bold">2</span>       E         Rajesh            <span style="color:#800080; font-weight:bold">7    958.00   136.8571</span>
<span style="color:#800080; font-weight:bold">3</span>       N        Gajanan           <span style="color:#800080; font-weight:bold">33   5348.95   162.0894</span>
<span style="color:#800080; font-weight:bold">4</span>       N        Mallesh           <span style="color:#800080; font-weight:bold">47   6931.70   147.4830</span>
<span style="color:#800080; font-weight:bold">5</span>       N          Priya           <span style="color:#800080; font-weight:bold">35   4790.65   136.8757</span>
<span style="color:#800080; font-weight:bold">6</span>       N           Ravi            <span style="color:#800080; font-weight:bold">7   1257.00   179.5714</span>
<span style="color:#800080; font-weight:bold">7</span>       S        Mahanta           <span style="color:#800080; font-weight:bold">24   3780.70   157.5292</span>
<span style="color:#800080; font-weight:bold">8</span>       S            Raj           <span style="color:#800080; font-weight:bold">23   3463.80   150.6000</span>
<span style="color:#800080; font-weight:bold">9</span>       S           Soni           <span style="color:#800080; font-weight:bold">44   6032.00   137.0909</span>
<span style="color:#800080; font-weight:bold">10</span>      W     Shreekanth           <span style="color:#800080; font-weight:bold">26   4065.20   156.3538</span>
<span style="color:#800080; font-weight:bold">11</span>      W      Shreerang           <span style="color:#800080; font-weight:bold">22   3570.20   162.2818</span>
<span style="color:#800080; font-weight:bold">12</span>      W      Sree Hari           <span style="color:#800080; font-weight:bold">29   4443.40   153.2207</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> cast<span style="color:#ff0080; font-weight:bold">(</span>m.book.sales<span style="color:#ff0080; font-weight:bold">,</span> Region <span style="color:#ff0080; font-weight:bold">+</span> Representative <span style="color:#ff0080; font-weight:bold">~</span> variable<span style="color:#ff0080; font-weight:bold">,</span>
      <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">mean</span><span style="color:#ff0080; font-weight:bold">),</span> <span style="color:#0080c0">subset</span> <span style="color:#ff0080; font-weight:bold">=</span> variable <span style="color:#ff0080; font-weight:bold">%</span><span style="color:#bb7977; font-weight:bold">in</span><span style="color:#ff0080; font-weight:bold">%</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;Sales&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Quantity&quot;</span><span style="color:#ff0080; font-weight:bold">))</span>
   Region Representative Sales_sum Sales_mean Quantity_sum Quantity_mean
<span style="color:#800080; font-weight:bold">1</span>       E           Kunj   <span style="color:#800080; font-weight:bold">1309.00   163.6250           82     10.250000</span>
<span style="color:#800080; font-weight:bold">2</span>       E         Rajesh    <span style="color:#800080; font-weight:bold">958.00   136.8571           53      7.571429</span>
<span style="color:#800080; font-weight:bold">3</span>       N        Gajanan   <span style="color:#800080; font-weight:bold">5348.95   162.0894          307      9.303030</span>
<span style="color:#800080; font-weight:bold">4</span>       N        Mallesh   <span style="color:#800080; font-weight:bold">6931.70   147.4830          445      9.468085</span>
<span style="color:#800080; font-weight:bold">5</span>       N          Priya   <span style="color:#800080; font-weight:bold">4790.65   136.8757          309      8.828571</span>
<span style="color:#800080; font-weight:bold">6</span>       N           Ravi   <span style="color:#800080; font-weight:bold">1257.00   179.5714           69      9.857143</span>
<span style="color:#800080; font-weight:bold">7</span>       S        Mahanta   <span style="color:#800080; font-weight:bold">3780.70   157.5292          239      9.958333</span>
<span style="color:#800080; font-weight:bold">8</span>       S            Raj   <span style="color:#800080; font-weight:bold">3463.80   150.6000          183      7.956522</span>
<span style="color:#800080; font-weight:bold">9</span>       S           Soni   <span style="color:#800080; font-weight:bold">6032.00   137.0909          391      8.886364</span>
<span style="color:#800080; font-weight:bold">10</span>      W     Shreekanth   <span style="color:#800080; font-weight:bold">4065.20   156.3538          222      8.538462</span>
<span style="color:#800080; font-weight:bold">11</span>      W      Shreerang   <span style="color:#800080; font-weight:bold">3570.20   162.2818          213      9.681818</span>
<span style="color:#800080; font-weight:bold">12</span>      W      Sree Hari   <span style="color:#800080; font-weight:bold">4443.40   153.2207          255      8.793103</span></pre>
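<p>The <code>subset</code> argument in the two calls above is, as far as I understand it, just a logical test evaluated against the columns of the melted data. To see what <code>%in%</code> does on its own, here is a small base R sketch. The <code>variable</code> vector below is a made-up stand-in for the column that <code>melt</code> creates, and <code>keep</code> is a name chosen only for this example.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># A made-up "variable" column like the one melt() produces
variable = c("Sales", "Margin", "Quantity", "Sales", "Margin", "Quantity")

# TRUE wherever the entry is one of the names we asked for
keep = variable %in% c("Sales", "Quantity")
keep             # TRUE FALSE TRUE TRUE FALSE TRUE

# Only the matching entries survive the subset
variable[keep]</pre>
<p>Rows where the test is <code>FALSE</code> are dropped before the summary functions are applied, which is why the Margin columns disappear from the output.</p>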
<p>Who needs pivot tables in Excel?</p>


]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-08-08/using-the-reshape-packagein-r/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Pivot Tables in Excel and OpenOffice.org Calc</title>
		<link>http://news.mrdwab.com/2010-07-30/pivot-tables/</link>
		<comments>http://news.mrdwab.com/2010-07-30/pivot-tables/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 19:34:42 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[OpenOffice.org]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[data pilot]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[pivot tables]]></category>
		<category><![CDATA[summarizing data]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=785</guid>
		<description><![CDATA[One of the features I find useful in Excel is the ability to create &#8220;pivot&#8221; tables. Essentially pivot tables let you summarize big tables of data in different ways, using different variables to &#8220;pivot&#8221; your data around (hence the name, I guess). Pivot tables are most easily understood through an example, so here&#8217;s one done [...]


]]></description>
			<content:encoded><![CDATA[<p><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/02-Insert-Pivot-Table.jpg" alt="" title="02 - Insert Pivot Table" width="165" height="142" class="alignright size-full wp-image-786" />One of the features I find useful in Excel is the ability to create &#8220;pivot&#8221; tables. Essentially, pivot tables let you summarize big tables of data in different ways, using different variables to &#8220;pivot&#8221; your data around (hence the name, I guess). Pivot tables are most easily understood through an example, so here&#8217;s one done in Excel 2007, along with the sort-of-equivalent &#8220;DataPilot&#8221; in OpenOffice.org Calc (OO.o Calc).</p>
<p><span id="more-785"></span></p>
<p>Below is the data we&#8217;ll be working with. As you can see, it&#8217;s a long spreadsheet with eight columns (Representative, Region, Month, Publisher, Subject, Sales, Margin, and Quantity) and over 300 rows. Certain calculations, like the total sum of sales, are easy: you just select the Sales column and use Excel&#8217;s or OO.o Calc&#8217;s sum function. But what if you wanted the total sales organized first by region, then by representative? That&#8217;s where pivot tables come into play, so let&#8217;s get started!</p>
<p><center><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdHFnVVh3aDA2LXpVR2ZLM3BnZ2dhdUE&#038;hl=en&#038;single=true&#038;gid=0&#038;output=html&#038;widget=true'></iframe></center></p>
<p><a href="http://news.mrdwab.com/booksales">Download the CSV file</a> and open it in Excel. Select all the data, jump over to the &#8220;Insert&#8221; tab, and click on &#8220;PivotTable&#8221;. This will open a dialog box similar to the following.</p>
<div id="attachment_787" class="wp-caption aligncenter" style="width: 409px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/07/03-Create-Pivot-Table-dialog.jpg" rel="lightbox[785]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/03-Create-Pivot-Table-dialog.jpg" alt="Create PivotTable Dialog Box" title="Create PivotTable Dialog Box" width="399" height="289" class="size-full wp-image-787" /></a><p class="wp-caption-text">Create PivotTable Dialog Box</p></div>
<p>I usually select the option to insert the pivot table in a new sheet, and this brings us to the following screen.</p>
<div id="attachment_788" class="wp-caption aligncenter" style="width: 410px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/07/04-The-pivot-table-screen.jpg" rel="lightbox[785]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/04-The-pivot-table-screen-400x213.jpg" alt="Empty PivotTable screen" title="Empty PivotTable screen" width="400" height="213" class="size-medium wp-image-788" /></a><p class="wp-caption-text">Empty PivotTable screen</p></div>
<div id="attachment_790" class="wp-caption alignright" style="width: 150px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/07/05-Example-panel-selections.jpg" rel="lightbox[785]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/05-Example-panel-selections-182x400.jpg" alt="Example panel selections" title="Example panel selections" width="140" class="size-medium wp-image-790" /></a><p class="wp-caption-text">Example panel selections</p></div><p>Pretty plain looking, right?</p>
<p>The important part is the &#8220;PivotTable Field List&#8221; pane on the right of the screen. In the top half, you have a list of the variables in your data. The bottom half is where the &#8220;pivoting&#8221; gets set up, simply by dragging the variables into the relevant areas. In this example, I started by dragging &#8220;Region&#8221; into the row labels area as the primary way to summarize the rows. I then dragged &#8220;Representative&#8221; below it so that each region is further broken down by sales representative. Finally, I dragged &#8220;Sales&#8221;, &#8220;Margin&#8221;, and &#8220;Quantity&#8221; into the values area.</p>
<p>By default, Excel assumes you want the sum, but you can also do different data summaries by right-clicking on the variables that you&#8217;ve dragged to the &#8220;values&#8221; area. Whatever you drag into the filters area creates additional filter options for your data.</p>
<p>Experiment a little: drag things around and see what types of consolidated results you end up with. Whatever you change in the PivotTable Field List pane is immediately reflected in the main spreadsheet area (which is one huge advantage that Excel 2007 has over OO.o Calc).<br />
<br style="clear:both" /><br />
After dragging some variables around, here&#8217;s what we can quickly end up with:</p>
<div id="attachment_789" class="wp-caption aligncenter" style="width: 407px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/07/06-Example-Output.jpg" rel="lightbox[785]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/06-Example-Output.jpg" alt="Example PivotTable Output" title="Example PivotTable Output" width="397" height="420" class="size-full wp-image-789" /></a><p class="wp-caption-text">Example PivotTable Output</p></div>
<p>The process in OpenOffice.org Calc is pretty similar. The option can be found in the &#8220;Data&#8221; menu, under &#8220;DataPilot&#8221;. The screencap below shows the DataPilot options window with some variables dragged into the relevant areas to create a simple pivot table in OO.o Calc.</p>
<div id="attachment_797" class="wp-caption aligncenter" style="width: 410px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/07/b03-example-panel-selections.jpg" rel="lightbox[785]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/b03-example-panel-selections-400x318.jpg" alt="OO.o Calc DataPilot with some options filled in" title="OO.o Calc DataPilot with some options filled in" width="400" height="318" class="size-medium wp-image-797" /></a><p class="wp-caption-text">OO.o Calc DataPilot with some options filled in</p></div>
<p>The &#8220;Page Fields&#8221; region in OO.o Calc is the equivalent of the filters area in Excel 2007. The rest is pretty much the same, just not as pretty. The output (in the screencap below) isn&#8217;t as pretty either, but it serves its function just fine.</p>
<div id="attachment_798" class="wp-caption aligncenter" style="width: 273px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/07/b04-example-output.jpg" rel="lightbox[785]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/07/b04-example-output-263x400.jpg" alt="OO.o Calc&#039;s less pretty but still functional output" title="OO.o Calc&#039;s less pretty but still functional output" width="263" height="400" class="size-medium wp-image-798" /></a><p class="wp-caption-text">OO.o Calc's less pretty but still functional output</p></div>
<p>So, now that you know all about pivot tables in Excel and OO.o Calc, you can have a data sorting and summarizing party. Once you get bored with that, you can sit around impatiently and wait for me to write about how you can do this sort of thing with R.</p>


]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-07-30/pivot-tables/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting data into R</title>
		<link>http://news.mrdwab.com/2010-07-11/getting-data-into-r/</link>
		<comments>http://news.mrdwab.com/2010-07-11/getting-data-into-r/#comments</comments>
		<pubDate>Sun, 11 Jul 2010 15:53:15 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[data entry]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=781</guid>
		<description><![CDATA[When you first open R, you&#8217;re greeted with a screen similar to the following: R version 2.10.0 (2009-10-26) Copyright (C) 2009 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. [...]


]]></description>
			<content:encoded><![CDATA[<p>When you first open R, you&#8217;re greeted with a screen similar to the following:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">R version <span style="color:#800080; font-weight:bold">2.10.0</span> <span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">2009</span>-<span style="color:#800080; font-weight:bold">10</span>-<span style="color:#800080; font-weight:bold">26</span><span style="color:#ff0080; font-weight:bold">)</span>
Copyright <span style="color:#ff0080; font-weight:bold">(</span>C<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#800080; font-weight:bold">2009</span> The R Foundation for Statistical Computing
ISBN <span style="color:#800080; font-weight:bold">3</span>-<span style="color:#800080; font-weight:bold">900051</span>-<span style="color:#800080; font-weight:bold">07</span>-<span style="color:#800080; font-weight:bold">0</span>

R is free software and comes with ABSOLUTELY NO WARRANTY<span style="color:#ff0080; font-weight:bold">.</span>
You are welcome to redistribute it under certain conditions<span style="color:#ff0080; font-weight:bold">.</span>
Type <span style="color:#ff0080; font-weight:bold">'</span>license<span style="color:#ff0080; font-weight:bold">()'</span> or <span style="color:#ff0080; font-weight:bold">'</span>licence<span style="color:#ff0080; font-weight:bold">()'</span> for distribution details<span style="color:#ff0080; font-weight:bold">.</span>

  Natural language support but running in an English locale

R is <span style="color:#bb7977; font-weight:bold">a</span> collaborative project with many contributors<span style="color:#ff0080; font-weight:bold">.</span>
Type <span style="color:#ff0080; font-weight:bold">'</span>contributors<span style="color:#ff0080; font-weight:bold">()'</span> for more information and
<span style="color:#ff0080; font-weight:bold">'</span>citation<span style="color:#ff0080; font-weight:bold">()'</span> on how to cite R or R packages in publications<span style="color:#ff0080; font-weight:bold">.</span>

Type <span style="color:#ff0080; font-weight:bold">'</span>demo<span style="color:#ff0080; font-weight:bold">()'</span> for some demos<span style="color:#ff0080; font-weight:bold">, '</span>help<span style="color:#ff0080; font-weight:bold">()'</span> for on-line help<span style="color:#ff0080; font-weight:bold">,</span> or
<span style="color:#ff0080; font-weight:bold">'</span>help<span style="color:#ff0080; font-weight:bold">.</span>start<span style="color:#ff0080; font-weight:bold">()'</span> for an HTML browser interface to help<span style="color:#ff0080; font-weight:bold">.</span>
Type <span style="color:#ff0080; font-weight:bold">'</span>q<span style="color:#ff0080; font-weight:bold">()'</span> to quit R<span style="color:#ff0080; font-weight:bold">.</span>

<span style="color:#ff0080; font-weight:bold">&gt;</span></pre>
<p>I&#8217;ve been trying to encourage my students to use R for some of their work, but in the process, I sort of forgot that for most people, starting up a program and just being greeted with a command prompt might be somewhat intimidating. So after several of my students indicated that they had downloaded and installed R but had no idea what to do next, I thought I would write about some of the very basic ways to get started. I recognize that for some huge datasets, the suggestions here are not the best, but for me, and for most of my students, the datasets that we would be working with are actually quite small.</p>
<p><span id="more-781"></span></p>
<h2>Part one: Entering your data directly in R</h2>
<p>For really small sets of data or for quick calculations, you might just go ahead and enter your data directly in R. The easiest way to do this is to start by entering each set of data as its own object. In the following example, we&#8217;re going to create a data frame (a table in R) with the names of some of my students and the scores they&#8217;ve received on three assignments in a fictional course titled &#8220;Data Analysis with R for NGO Workers&#8221;.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> Student <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;Gajanan&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Hari&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Priya&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Raj&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Shreekanth&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Shreerang&quot;</span><span style="color:#ff0080; font-weight:bold">,</span>
<span style="color:#ff0080; font-weight:bold">+</span> <span style="color:#a68500">&quot;Soni&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Vinay&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> Assignment.1 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">93</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">98</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">90</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">70</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">80</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">82</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">75</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">77</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> Assignment.2 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">90</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">87</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">83</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">88</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">78</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">87</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">79</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">84</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> Assignment.3 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">97</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">92</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">85</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">90</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">77</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">70</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">90</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">93</span><span style="color:#ff0080; font-weight:bold">)</span></pre>
<p>Notice that when you enter this information, you don&#8217;t receive any &#8220;acknowledgement&#8221; from R that anything has happened. It just shows you the prompt again. To see the values, you need to type the name of the object that you have created: &#8220;Student&#8221;, &#8220;Assignment.1&#8221;, &#8220;Assignment.2&#8221;, or &#8220;Assignment.3&#8221;. Notice also that R is case-sensitive, so typing &#8220;student&#8221; would result in an error since we entered the name with an upper-case &#8220;S&#8221;. Finally, if you have not completed your statement (as in the first object we created), R adds a little &#8220;+&#8221; at the start of the next line to indicate that your statement is incomplete.</p>
<p>Here&#8217;s what we get when we type &#8220;Assignment.2&#8221;:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> Assignment.2
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">90 87 83 88 78 87 79 84</span></pre>
<p>The &#8220;[1]&#8221; at the start of the second line above is the index of the first value. Consider the following:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">);</span> <span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">300</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#800080; font-weight:bold">30</span><span style="color:#ff0080; font-weight:bold">)</span>
 <span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span>  <span style="color:#800080; font-weight:bold">87 236 122 263 279  14 156 262 162 133 278 132 196 165  30 257  70  12  93</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">20</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">269 250 194 179 276 181 195 150 163  79  40</span></pre>
<p>In this case, the number &#8220;269&#8221; is the twentieth number in this list of numbers. (See <a href="http://news.mrdwab.com/2009-11-29/simple-sampling-with-r/" title="Simple sampling with R">Simple sampling with R</a> and <a href="http://news.mrdwab.com/2009-11-30/sampling-with-replacement-in-r/" title="Sampling with replacement in R">Sampling with replacement in R</a> for a basic introduction to sampling.) Knowing the position of a value is useful because you will occasionally want to select just a single value from a vector. Type the following and compare it to what you got when you typed &#8220;Assignment.2&#8221;:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> Assignment.2<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">]</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">83</span></pre>
<p>R has returned the third value from the object you created.</p>
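<p>As a small sketch of the same idea, you can also ask for several positions at once, or ask R which positions hold a given value. The scores are the Assignment.2 values from above, re-entered so the snippet runs on its own; the other object names are just ones I made up for illustration.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">Assignment.2 = c(90, 87, 83, 88, 78, 87, 79, 84)  # same scores as above
first.and.third = Assignment.2[c(1, 3)]           # values at positions 1 and 3
positions.of.87 = which(Assignment.2 == 87)       # positions where the score is 87
all.but.first = Assignment.2[-1]                  # a negative index drops that position
print(first.and.third)
print(positions.of.87)</pre>
<p>Here the first print shows 90 and 83, and the second shows positions 2 and 6, since 87 appears twice.</p>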
<p>If you want to create a simple table of these four sets of data you&#8217;ve created, you use the &#8220;data.frame&#8221; function. The first line below creates a data frame called &#8220;R.For.NGOs&#8221; and the second one tells R to display it. The last line, &#8220;fix(R.For.NGOs)&#8221;, opens up R&#8217;s built-in spreadsheet-style data editor, which I generally don&#8217;t use except to quickly scan data.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> R.For.NGOs <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">data.frame</span><span style="color:#ff0080; font-weight:bold">(</span>Student<span style="color:#ff0080; font-weight:bold">,</span> Assignment.1<span style="color:#ff0080; font-weight:bold">,</span> Assignment.2<span style="color:#ff0080; font-weight:bold">,</span> Assignment.3<span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> R.For.NGOs
     Student Assignment.1 Assignment.2 Assignment.3
<span style="color:#800080; font-weight:bold">1</span>    Gajanan           <span style="color:#800080; font-weight:bold">93           90           97</span>
<span style="color:#800080; font-weight:bold">2</span>       Hari           <span style="color:#800080; font-weight:bold">98           87           92</span>
<span style="color:#800080; font-weight:bold">3</span>      Priya           <span style="color:#800080; font-weight:bold">90           83           85</span>
<span style="color:#800080; font-weight:bold">4</span>        Raj           <span style="color:#800080; font-weight:bold">70           88           90</span>
<span style="color:#800080; font-weight:bold">5</span> Shreekanth           <span style="color:#800080; font-weight:bold">80           78           77</span>
<span style="color:#800080; font-weight:bold">6</span>  Shreerang           <span style="color:#800080; font-weight:bold">82           87           70</span>
<span style="color:#800080; font-weight:bold">7</span>       Soni           <span style="color:#800080; font-weight:bold">75           79           90</span>
<span style="color:#800080; font-weight:bold">8</span>      Vinay           <span style="color:#800080; font-weight:bold">77           84           93</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">fix</span><span style="color:#ff0080; font-weight:bold">(</span>R.For.NGOs<span style="color:#ff0080; font-weight:bold">)</span></pre>
<p>Now, try using the index feature and see what happens.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> R.For.NGOs<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">]</span>
  Assignment.2
<span style="color:#800080; font-weight:bold">1           90</span>
<span style="color:#800080; font-weight:bold">2           87</span>
<span style="color:#800080; font-weight:bold">3           83</span>
<span style="color:#800080; font-weight:bold">4           88</span>
<span style="color:#800080; font-weight:bold">5           78</span>
<span style="color:#800080; font-weight:bold">6           87</span>
<span style="color:#800080; font-weight:bold">7           79</span>
<span style="color:#800080; font-weight:bold">8           84</span></pre>
<p>This is probably not what you expected, right? R has returned just the third column. Once data is in a table or a matrix, R needs both a row index and a column index, in that order, to return a specific value. Let&#8217;s say we wanted Gajanan&#8217;s score for the third assignment. Gajanan is the first row, and Assignment 3 is the fourth column, so we need to reference the index &#8220;[1,4]&#8221;. If we wanted only Priya&#8217;s scores, she&#8217;s the third row, so we leave the column position empty and reference the index &#8220;[3,]&#8221;. (Note that you do not write &#8220;[3,0]&#8221;. Writing &#8220;[,3]&#8221; would instead return the whole third column, this time as a plain vector.)</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> R.For.NGOs<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">4</span><span style="color:#ff0080; font-weight:bold">]</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">97</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> R.For.NGOs<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">,]</span>
  Student Assignment.1 Assignment.2 Assignment.3
<span style="color:#800080; font-weight:bold">3</span>   Priya           <span style="color:#800080; font-weight:bold">90           83           85</span></pre>
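<p>Counting rows and columns gets tedious, so here is a rough sketch of selecting by name instead. It rebuilds a three-student version of the table so that it runs on its own; the object names &#8220;Scores&#8221;, &#8220;priya.row&#8221;, and &#8220;gajanan.a3&#8221; are just ones I made up for illustration.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># A smaller copy of the table above, rebuilt so this snippet is self-contained
Scores = data.frame(Student = c("Gajanan", "Hari", "Priya"),
                    Assignment.1 = c(93, 98, 90),
                    Assignment.2 = c(90, 87, 83),
                    Assignment.3 = c(97, 92, 85))
# One column by name: the $ operator returns it as a plain vector
Scores$Assignment.3
# One row by a condition: the comparison goes before the comma (the row position)
priya.row = Scores[Scores$Student == "Priya", ]
# One value: row chosen by condition, column chosen by name
gajanan.a3 = Scores[Scores$Student == "Gajanan", "Assignment.3"]
print(priya.row)
print(gajanan.a3)</pre>
<p>Selecting by name keeps working even if you later add or reorder columns, which is why I tend to prefer it once a table has more than a handful of them.</p>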
<h2>Part two: Using a spreadsheet for data entry</h2>
<p>As you can see above, it is not too difficult to create your data right in R. However, if you have a dataset with a lot of records (like the one I used in <a href="http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/" title="Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R">Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R</a>), you would be silly to type it all into R. For such data, it makes much more sense to use something like OpenOffice.org Calc or Microsoft Excel or some other spreadsheet interface. For starters, you would be more comfortable with the interface. Additionally, there may be more opportunities to quickly check your data over for errors, and you might even be able to set up data-entry rules to prevent incorrect values from being entered.</p>
<p>How do you get your data into R if it&#8217;s in an Excel file or another spreadsheet? There are several ways. The most common one I use is to just save my data as a comma-separated values (CSV) file and open that in R. The second most common approach I use is to copy the data and use R&#8217;s &#8220;clipboard&#8221; feature to get the data into R. Here&#8217;s how you&#8217;d proceed for each of these approaches. For the CSV approach, I&#8217;ll assume that you&#8217;ve used &#8220;File &gt; Change dir&#8230;&#8221; to have R working out of your &#8220;My Documents&#8221; folder and that your CSV file is saved in that folder. Here, I&#8217;m going to create an object in R called &#8220;Book.Sales&#8221; using a file called &#8220;BookSales.csv&#8221; stored in my &#8220;My Documents&#8221; folder. This file has the data starting on the second row; the first row contains the column names.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> Book.Sales <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.csv</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;BookSales.csv&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> header<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#8080c0; font-weight:bold">TRUE</span><span style="color:#ff0080; font-weight:bold">)</span></pre>
<p>If you don&#8217;t want to change the working directory, you can also enter the full path to the file. For example &#8220;c:\\data\\file.csv&#8221; or &#8220;c:/data/file.csv&#8221; can be used to access a file called &#8220;file.csv&#8221; in a folder named &#8220;data&#8221; on drive C. Also, if you know the URL of a CSV file, you can access the file directly by typing the URL in place of the file name. This is the option I usually use for my online examples.</p>
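<p>To try the full-path form without needing my file, the sketch below first writes a tiny made-up CSV to a temporary folder and then reads it back. The file name &#8220;DemoSales.csv&#8221;, the column names, and the numbers are all invented for illustration; they are not the contents of &#8220;BookSales.csv&#8221;.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># Made-up data, written to a temporary file so the example is self-contained
demo.path = file.path(tempdir(), "DemoSales.csv")
writeLines(c("Title,Copies", "Book A,12", "Book B,7"), demo.path)
# Read it back using the full path; header = TRUE treats row one as column names
Demo.Sales = read.csv(demo.path, header = TRUE)
str(Demo.Sales)  # shows the column names and the type R guessed for each column
# Show the full path with forward slashes, the form that needs no doubling
normalizePath(demo.path, winslash = "/")</pre>
<p>The backslashes have to be doubled in a path because a single backslash starts an escape sequence inside an R string, which is why I usually just type forward slashes.</p>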
<blockquote><p>Some advice: do not use spreadsheets with merged cells and lots of blank cells at the top. Instead, create a new CSV file where the first row contains the column names and the data starts on the next line. That makes putting your data into R very easy&#8230;.</p></blockquote>
<p>If you prefer to go the &#8220;cut-and-paste&#8221; way, just open up your spreadsheet, copy the cells you&#8217;re interested in, and type the following (we&#8217;ll assume it is the same dataset):</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> Book.Sales <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.delim</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;clipboard&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> header<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#8080c0; font-weight:bold">TRUE</span><span style="color:#ff0080; font-weight:bold">)</span></pre>
<p>The main difference between the read.csv and read.delim options is that read.csv looks for a comma separating each value, while read.delim looks for a tab character, which is what a spreadsheet puts between cells when you copy them.</p>
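<p>A quick way to see the separator at work, as a sketch with invented values: the &#8220;text&#8221; argument lets you hand these functions a string instead of a file, so you can feed in the same two rows once with commas and once with tabs.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"># The same two rows of made-up data, once comma-separated and once tab-separated
comma.text = "Title,Copies\nBook A,12\nBook B,7"
tab.text = "Title\tCopies\nBook A\t12\nBook B\t7"
from.csv = read.csv(text = comma.text)      # splits each line on commas
from.delim = read.delim(text = tab.text)    # splits each line on tabs
all.equal(from.csv, from.delim)             # compares the two data frames
# Using the wrong reader does not stop with an error; it quietly
# squeezes everything into a single column
wrong = read.csv(text = tab.text)
ncol(wrong)</pre>
<p>If your data ever shows up in R as one wide column, a mismatched separator like this is the first thing I would check.</p>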
<p>Once you&#8217;ve overcome the hurdle of getting data into R, playing with your data will be much more fun!</p>


<p>Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/' rel='bookmark' title='Permanent Link: Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R'>Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R</a> <small>A lot of the times, students at the Academy enter...</small></li>
<li><a href='http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/' rel='bookmark' title='Permanent Link: A little spark for presenting your data'>A little spark for presenting your data</a> <small>For some reason, I&#8217;ve been obsessing over the presentation of...</small></li>
<li><a href='http://news.mrdwab.com/2009-11-15/am-i-inconsistent/' rel='bookmark' title='Permanent Link: Am I inconsistent?'>Am I inconsistent?</a> <small>I won&#8217;t pretend that I don&#8217;t have any illegal software...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-07-11/getting-data-into-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The awesomeness that is WordPress and child themes</title>
		<link>http://news.mrdwab.com/2010-07-08/the-awesomeness-that-is-wordpress-and-child-themes/</link>
		<comments>http://news.mrdwab.com/2010-07-08/the-awesomeness-that-is-wordpress-and-child-themes/#comments</comments>
		<pubDate>Thu, 08 Jul 2010 16:10:46 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[@font-face]]></category>
		<category><![CDATA[child themes]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[themes]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=774</guid>
		<description><![CDATA[After a long long long time, I decided to get back into some actual website experimentation. For a while, I was really lazy with my site design and settled on using the Atahualpa theme. This was a great be-lazy theme, but I noticed that while browsing other sites, it often became too obvious which sites [...]


Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2008-06-23/2657-productions-facelift/' rel='bookmark' title='Permanent Link: 2657 Productions is going to get a facelift'>2657 Productions is going to get a facelift</a> <small>I&#8217;ve been trying to keep myself busy recently. As I...</small></li>
<li><a href='http://news.mrdwab.com/2007-03-02/nested-lists-and-css/' rel='bookmark' title='Permanent Link: Nested Lists and CSS'>Nested Lists and CSS</a> <small>I don&#8217;t use lists too often for my own stuff,...</small></li>
<li><a href='http://news.mrdwab.com/2008-06-20/damn-it/' rel='bookmark' title='Permanent Link: DAMN It!'>DAMN It!</a> <small>I was just having some fun at work a couple...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>After a long long long time, I decided to get back into some actual website experimentation. For a while, I was really lazy with my site design and settled on using the <a href="http://wordpress.org/extend/themes/atahualpa" target="_blank">Atahualpa</a> theme. This was a great be-lazy theme, but I noticed that while browsing other sites, it often became too obvious which sites were made using the theme, and I, always wanting &#8220;my own thing&#8221;, decided that it was time to get my hands at least a little bit dirty again.</p>

<a href='http://news.mrdwab.com/2010-07-08/the-awesomeness-that-is-wordpress-and-child-themes/screen-3/' title='screen-3'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/07/screen-3-150x150.jpg" class="attachment-thumbnail" alt="screen-3" title="screen-3" /></a>
<a href='http://news.mrdwab.com/2010-07-08/the-awesomeness-that-is-wordpress-and-child-themes/screen-1/' title='screen-1'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/07/screen-1-150x150.jpg" class="attachment-thumbnail" alt="screen-1" title="screen-1" /></a>
<a href='http://news.mrdwab.com/2010-07-08/the-awesomeness-that-is-wordpress-and-child-themes/screen-2/' title='screen-2'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/07/screen-2-150x150.jpg" class="attachment-thumbnail" alt="screen-2" title="screen-2" /></a>

<p><span id="more-774"></span></p>
<p>I had posted a little while ago about <a href="http://news.mrdwab.com/2010-05-20/font-experiments/">my excitement</a> and <a href="http://news.mrdwab.com/2010-05-20/font-experiments-part-2/">my disappointment</a> over the <a href="http://code.google.com/webfonts" target="_blank">Google Font Directory</a> and that had inspired me to look into other ways to get some custom fonts online. <a href="http://www.fontsquirrel.com/fontface/generator" target="_blank">Font Squirrel</a> offered a great free @font-face generator, so I loaded up my <a href="http://thegrumpywriter.wordpress.com">Mr. Grumpy</a> <a href="http://thegrumpywriter.wordpress.com/fonts/">fonts</a> and decided to see what I could do with it.</p>
<p>With my fonts ready, I decided to get back into my own theme design. This time, rather than designing the theme from scratch, I decided to create a child theme for one of the better theme frameworks I found: <a href="http://wordpress.org/extend/themes/thematic" target="_blank">Thematic</a>.</p>
<p>For those of you who don&#8217;t know what a child theme is, think about it this way. WordPress periodically needs to be updated (to fix bugs, add new features, enhance security, and so on). When you host your own WordPress installation, you&#8217;re also tempted (at least nerdy people like me are tempted) to customize the themes you use, and in the past, that often meant making edits to the original theme files. But just as WordPress needs to be periodically updated, so too do its themes (for example, when WordPress added widget support or post thumbnail support, theme modifications were required). When you updated your theme, however, all of your customizations would then be lost. </p>
<p>Child themes solve this problem. You pick a well-structured theme that is up-to-date with all the WordPress features you need, create a new folder in your WordPress themes directory for your child theme, and create the new theme by doing as little as adding a new stylesheet that imports the styles from your chosen template. If you need to make more significant changes to the parent theme, instead of modifying the actual theme files, you create a &#8220;functions.php&#8221; file in your child theme directory. So, for instance, if you&#8217;ve decided that you don&#8217;t want the &#8220;access&#8221; div that&#8217;s present in the Thematic framework, you can remove it by adding the following to your functions.php file:</p>
<pre class="brush: php;">
// Remove Thematic access
function remove_access() {
	remove_action('thematic_header','thematic_access',9);
}
add_action('init','remove_access');
</pre>
<p>Or, if you want to add something to the head of your page (like some extra JavaScript) you can add something like the following:</p>
<pre class="brush: php;">
function yourfunctionname() { ?&gt;
&lt;script type=&quot;text/javascript&quot;&gt;
Your Fancy javascript
&lt;/script&gt;
&lt;?php }
add_action('wp_head', 'yourfunctionname');
</pre>
<p>In the above snippet, the first line starts by naming our function and ends with { ?&gt;, which switches us out of PHP so we can add normal HTML. The &lt;?php } line switches back into PHP and closes the function. The last line does not call the function directly; it hooks the function we just defined onto wp_head, so that WordPress runs it while building the head of each page. There are some good instructions at the <a href="http://themeshaper.com/thematic/guide/" target="_blank">Thematic Theme Framework Guide</a> for how to use the different Thematic hooks and filters to create a truly customized site. </p>
<p>As I was making these changes, <a href="http://alistapart.com" target="_blank">A List Apart</a> published <a href="http://www.alistapart.com/articles/supersize-that-background-please/" target="_blank">a great article about full-screen backgrounds</a> and, of course, I had to try it out. But, I wanted to take it a little bit further: make the background change with each page. So, for that, I used <a href="http://ma.tt/scripts/randomimage/" target="_blank">Matt&#8217;s random image script</a> and put that into my stylesheet. This is something I might change though. It seems like referencing the PHP file in the CSS works as I expected with Opera and Firefox (a different background image for each page) but with Chrome, it loads a different background image only once each session (which is also OK, but not what I was looking for). (From A List Apart, I also borrowed ideas for using CSS transparency and rounded corners, as well as their ideas for applying different CSS rules for different window sizes.) </p>
<p>Here are a few snippets of the CSS customization.</p>
<p>For the transparency and rounded corners for the header area and the main text area:</p>
<pre class="brush: css;">
#branding
{
	margin-top: 20px;
}
#main, #branding
{
	background-color: #F8F8F8;
	border-radius: 12px;
	margin-bottom: 20px;
	-moz-border-radius: 12px;
	-webkit-border-radius: 12px;
	filter: alpha(opacity=90);
	-khtml-opacity: 0.9;
	-moz-opacity: 0.9;
	opacity: 0.9;
}
</pre>
<p>For centering the page if the screen is small and floating the content to the right if the page is wider than 1280 pixels:</p>
<pre class="brush: css;">
@media all and (min-width: 1280px)
{
	#branding, #main
	{
		float: right;
		margin-right: 20px;
	}
}
@media only all and (max-width: 1024px) and (max-height: 768px)
{
	body
	{
		background-size: 1024px 768px;
		-moz-background-size: 1024px 768px;
	}
}
</pre>
<p>For getting a random image each time the page is loaded, instead of mentioning an image URL like you normally would, you mention the random image script I referenced earlier (saved here as rotatebg.php):</p>
<pre class="brush: css;">
body
{
	background: #fff url(rotatebg.php) left bottom fixed no-repeat;
	background-size: cover;
	color: #000;
	margin: 0;
	-moz-background-size: cover;
	padding: 0;
}
</pre>
<p>Maybe eventually, when I get around to tweaking this a bit more and testing it in some other browsers (I&#8217;m not even going to bother being nice to Internet Explorer) I&#8217;ll package up the child theme and share it&#8230;.</p>


<p>Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2008-06-23/2657-productions-facelift/' rel='bookmark' title='Permanent Link: 2657 Productions is going to get a facelift'>2657 Productions is going to get a facelift</a> <small>I&#8217;ve been trying to keep myself busy recently. As I...</small></li>
<li><a href='http://news.mrdwab.com/2007-03-02/nested-lists-and-css/' rel='bookmark' title='Permanent Link: Nested Lists and CSS'>Nested Lists and CSS</a> <small>I don&#8217;t use lists too often for my own stuff,...</small></li>
<li><a href='http://news.mrdwab.com/2008-06-20/damn-it/' rel='bookmark' title='Permanent Link: DAMN It!'>DAMN It!</a> <small>I was just having some fun at work a couple...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-07-08/the-awesomeness-that-is-wordpress-and-child-themes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>R is like a giant calculator for grownups</title>
		<link>http://news.mrdwab.com/2010-06-30/r-is-like-a-giant-calculator-for-grownups/</link>
		<comments>http://news.mrdwab.com/2010-06-30/r-is-like-a-giant-calculator-for-grownups/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 07:04:28 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[basic calculations]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=762</guid>
		<description><![CDATA[One of the things that is interesting about R is how flexible it is. One of the fun things about it is how interactive it can be. While my examples so far have been a little bit more involved, it can be useful to spend some time just getting acquainted with how R performs basic [...]


Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2010-07-11/getting-data-into-r/' rel='bookmark' title='Permanent Link: Getting data into R'>Getting data into R</a> <small>When you first open R, you&#8217;re greeted with a screen...</small></li>
<li><a href='http://news.mrdwab.com/2009-11-29/simple-sampling-with-r/' rel='bookmark' title='Permanent Link: Simple sampling with R'>Simple sampling with R</a> <small>I mentioned in an earlier post (Am I inconsistent?) that...</small></li>
<li><a href='http://news.mrdwab.com/2009-11-30/sampling-with-replacement-in-r/' rel='bookmark' title='Permanent Link: Sampling with replacement in R'>Sampling with replacement in R</a> <small>In my last post about sampling (Simple sampling with R)...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>One of the things that is interesting about R is how flexible it is. One of the fun things about it is how interactive it can be. While my examples so far <a href="http://news.mrdwab.com/2009-11-29/simple-sampling-with-r/">have been</a> <a href="http://news.mrdwab.com/2009-11-30/sampling-with-replacement-in-r/">a little</a> <a href="http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/">bit</a> <a href="http://news.mrdwab.com/2010-05-16/choropleth-party-with-r/">more</a> <a href="http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/">involved</a>, it can be useful to spend some time just getting acquainted with how R performs basic calculations. In fact, I sometimes like to think of R as a giant calculator for grownups to play with. The following syntax snippets show how you can perform basic calculations with R. This is by no means complete, but it should provide a reasonable introduction to someone <em>just getting started</em> with R. (Experienced R users would find this TOTALLY useless&#8230;.)</p>
<p><span id="more-762"></span></p>
<p>Let&#8217;s get started with really basic stuff&#8230; (By the way, everything on a line following the &#8220;#&#8221; is treated as a comment and is not processed by R. The comments are just there to explain what&#8217;s going on at each step for the uninitiated.)</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">+</span><span style="color:#800080; font-weight:bold">2</span> <span style="color:#f27900"># addition</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">4</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">-</span><span style="color:#800080; font-weight:bold">2</span> <span style="color:#f27900"># subtraction</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">0</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">*</span><span style="color:#800080; font-weight:bold">3</span> <span style="color:#f27900"># multiplication</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">6</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">/</span><span style="color:#800080; font-weight:bold">2</span> <span style="color:#f27900"># division</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">1</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">2</span>^<span style="color:#800080; font-weight:bold">3</span> <span style="color:#f27900"># exponents</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">8</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">sqrt</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">16</span><span style="color:#ff0080; font-weight:bold">)/</span><span style="color:#800080; font-weight:bold">2</span> <span style="color:#f27900"># square root</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">2</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">16</span>^<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">/</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># square root (as exponent)</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">4</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#800080; font-weight:bold">8</span>^<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">/</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># cubed root (you get the idea)</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">2</span></pre>
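<p>Two more things are handy to know at this stage, sketched here with numbers I picked purely for illustration: R follows the usual order of operations (exponents first, then multiplication and division, then addition and subtraction), and it has separate operators for integer division and remainders.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">2 + 3 * 4     # multiplication happens before addition: 14
(2 + 3) * 4   # parentheses change the order: 20
-2^2          # the exponent is applied before the minus sign: -4
17 %/% 5      # integer division: 3
17 %% 5       # remainder after division: 2</pre>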
<p>Storing values in objects makes it convenient to reuse them in later calculations.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> aa <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">8</span>^<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">/</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># assigning a variable with the name &quot;aa&quot;</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> aa^<span style="color:#800080; font-weight:bold">2</span> <span style="color:#f27900"># aa squared</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">4</span>
<span style="color:#ff0080; font-weight:bold">&gt; (</span><span style="color:#800080; font-weight:bold">8</span>^<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">/</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">))</span>^<span style="color:#800080; font-weight:bold">2</span> <span style="color:#f27900"># same as &quot;aa^2&quot; above</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">4</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> bb <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">4</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># an object with the numbers 1 to 5, named &quot;bb&quot;</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> cc <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">10</span> <span style="color:#f27900"># a single object with the value &quot;10&quot;</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> bb<span style="color:#ff0080; font-weight:bold">/</span>cc <span style="color:#f27900"># doing basic math on the objects</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">0.1 0.2 0.3 0.4 0.5</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> bb<span style="color:#ff0080; font-weight:bold">+</span>cc <span style="color:#f27900"># more basic math</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">11 12 13 14 15</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> dd <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">25</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">21</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># another way to create a series of numbers (descending)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> dd<span style="color:#ff0080; font-weight:bold">-</span>bb <span style="color:#f27900"># subtracts 1 from 25, 2 from 24, and so on</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">24 22 20 18 16</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">sum</span><span style="color:#ff0080; font-weight:bold">(</span>dd<span style="color:#ff0080; font-weight:bold">-</span>bb<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># &quot;sum&quot; of the resulting values of the expression &quot;dd-bb&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">100</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">mean</span><span style="color:#ff0080; font-weight:bold">(</span>dd<span style="color:#ff0080; font-weight:bold">-</span>bb<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># average of the resulting values of the expression &quot;dd-bb&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">20</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> ddd <span style="color:#ff0080; font-weight:bold">=</span> dd<span style="color:#ff0080; font-weight:bold">-</span>bb <span style="color:#f27900"># you can even create *another* object....</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">mean</span><span style="color:#ff0080; font-weight:bold">(</span>ddd<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># you get the idea...</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">20</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">summary</span><span style="color:#ff0080; font-weight:bold">(</span>ddd<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># convenient summary statistics about a distribution</span>
   Min. <span style="color:#800080; font-weight:bold">1</span>st Qu.  Median    Mean <span style="color:#800080; font-weight:bold">3</span>rd Qu.    Max.
     <span style="color:#800080; font-weight:bold">16      18      20      20      22      24</span></pre>
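<p>A couple of side notes on the vectors above, as a small sketch. The colon operator already returns a vector, so wrapping it in c() is optional, and seq() is useful when the step is not 1. When two vectors have different lengths, R reuses the shorter one. The object names ee and ff are just ones I made up for this illustration.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">ee = 25:21                            # same values as c(25:21)
identical(ee, c(25:21))               # TRUE
ff = seq(from = 2, to = 10, by = 2)   # 2 4 6 8 10
ee - ff                               # element by element: 23 20 17 14 11
c(1, 2, 3, 4) * c(10, 100)            # the shorter vector is reused: 10 200 30 400</pre>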
<p>You can even set up your own equations as &#8220;functions&#8221; in R.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> area.circ <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#bb7977; font-weight:bold">function</span><span style="color:#ff0080; font-weight:bold">(</span>r<span style="color:#ff0080; font-weight:bold">) {</span><span style="color:#0080c0">pi</span><span style="color:#ff0080; font-weight:bold">*</span>r^<span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">}</span> <span style="color:#f27900"># solving simple equations</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> area.circ<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># calculating for a radius of 1...</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">3.141593</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> area.circ<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># ... and a radius of 2</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">12.56637</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> y <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#bb7977; font-weight:bold">function</span><span style="color:#ff0080; font-weight:bold">(</span>m<span style="color:#ff0080; font-weight:bold">,</span>x<span style="color:#ff0080; font-weight:bold">,</span>b<span style="color:#ff0080; font-weight:bold">) (</span>m<span style="color:#ff0080; font-weight:bold">*</span>x<span style="color:#ff0080; font-weight:bold">+</span>b<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># more simple equations... set up the function</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> y<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">12</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># and feed R the values that you want to use</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">26</span></pre>
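<p>Functions like these also accept whole vectors, and arguments can be given default values and passed by name. Here is a minimal sketch; line.y and its default values are hypothetical and only there to illustrate the idea.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">area.circ = function(r) {pi*r^2}               # same function as above
area.circ(c(1, 2, 3))                          # one area for each radius
line.y = function(x, m = 1, b = 0) {m*x + b}   # m and b have default values
line.y(2)                                      # uses the defaults: 2
line.y(x = 2, m = 12, b = 2)                   # named arguments: 26</pre>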
<p>A table (a data frame, really) is an example of an object in R. Below, we create two tables, each with 5 rows and 3 columns. Then, we perform some basic calculations on the tables.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> var.1 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># create a vector with the numbers 1 to 5</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># set a seed so you can replicate this</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> var.2 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">300</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">))</span> <span style="color:#f27900"># randomly sample 5 numbers from 300</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># ...</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> var.3 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">500</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">))</span> <span style="color:#f27900"># randomly sample 5 numbers from 500</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> var.1 <span style="color:#ff0080; font-weight:bold">+</span> var.2 <span style="color:#ff0080; font-weight:bold">+</span> var.3 <span style="color:#f27900"># do some basic mathematics</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">232 632 329 706 751</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">data.frame</span><span style="color:#ff0080; font-weight:bold">(</span>var.1<span style="color:#ff0080; font-weight:bold">,</span> var.2<span style="color:#ff0080; font-weight:bold">,</span> var.3<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># put your objects into a table</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1 <span style="color:#f27900"># view your table</span>
  var.1 var.2 var.3
<span style="color:#800080; font-weight:bold">1     1    87   144</span>
<span style="color:#800080; font-weight:bold">2     2   236   394</span>
<span style="color:#800080; font-weight:bold">3     3   122   204</span>
<span style="color:#800080; font-weight:bold">4     4   263   439</span>
<span style="color:#800080; font-weight:bold">5     5   279   467</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1<span style="color:#ff0080; font-weight:bold">+</span><span style="color:#800080; font-weight:bold">10</span> <span style="color:#f27900"># add 10 to every item in your table</span>
  var.1 var.2 var.3
<span style="color:#800080; font-weight:bold">1    11    97   154</span>
<span style="color:#800080; font-weight:bold">2    12   246   404</span>
<span style="color:#800080; font-weight:bold">3    13   132   214</span>
<span style="color:#800080; font-weight:bold">4    14   273   449</span>
<span style="color:#800080; font-weight:bold">5    15   289   477</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1<span style="color:#ff0080; font-weight:bold">*</span>var.1 <span style="color:#f27900"># multiply your table by var.1</span>
  var.1 var.2 var.3
<span style="color:#800080; font-weight:bold">1     1    87   144</span>
<span style="color:#800080; font-weight:bold">2     4   472   788</span>
<span style="color:#800080; font-weight:bold">3     9   366   612</span>
<span style="color:#800080; font-weight:bold">4    16  1052  1756</span>
<span style="color:#800080; font-weight:bold">5    25  1395  2335</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#f27900"># notice that R goes from top to bottom and loops.</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1<span style="color:#ff0080; font-weight:bold">*</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">3</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># Try the same thing with multiplying by c(1:3))</span>
  var.1 var.2 var.3
<span style="color:#800080; font-weight:bold">1     1   261   288</span>
<span style="color:#800080; font-weight:bold">2     4   236  1182</span>
<span style="color:#800080; font-weight:bold">3     9   244   204</span>
<span style="color:#800080; font-weight:bold">4     4   789   878</span>
<span style="color:#800080; font-weight:bold">5    10   279  1401</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> var.4 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">6</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">10</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># set up some new variables</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> var.5 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">100</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">))</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">set.seed</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">123</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> var.6 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">sample</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">200</span><span style="color:#ff0080; font-weight:bold">,</span><span style="color:#800080; font-weight:bold">5</span><span style="color:#ff0080; font-weight:bold">))</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.2 <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">data.frame</span><span style="color:#ff0080; font-weight:bold">(</span>var.4<span style="color:#ff0080; font-weight:bold">,</span> var.5<span style="color:#ff0080; font-weight:bold">,</span> var.6<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># and create a second table</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.2 <span style="color:#f27900"># view your table</span>
  var.4 var.5 var.6
<span style="color:#800080; font-weight:bold">1     6    29    58</span>
<span style="color:#800080; font-weight:bold">2     7    79   157</span>
<span style="color:#800080; font-weight:bold">3     8    41    81</span>
<span style="color:#800080; font-weight:bold">4     9    86   174</span>
<span style="color:#800080; font-weight:bold">5    10    91   185</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1<span style="color:#ff0080; font-weight:bold">+</span>table.2 <span style="color:#f27900"># add table.1 and table.2 together</span>
  var.1 var.2 var.3
<span style="color:#800080; font-weight:bold">1     7   116   202</span>
<span style="color:#800080; font-weight:bold">2     9   315   551</span>
<span style="color:#800080; font-weight:bold">3    11   163   285</span>
<span style="color:#800080; font-weight:bold">4    13   349   613</span>
<span style="color:#800080; font-weight:bold">5    15   370   652</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> table.1<span style="color:#ff0080; font-weight:bold">*</span>table.2 <span style="color:#f27900"># multiply table.1 and table.2 together</span>
  var.1 var.2 var.3
<span style="color:#800080; font-weight:bold">1     6  2523  8352</span>
<span style="color:#800080; font-weight:bold">2    14 18644 61858</span>
<span style="color:#800080; font-weight:bold">3    24  5002 16524</span>
<span style="color:#800080; font-weight:bold">4    36 22618 76386</span>
<span style="color:#800080; font-weight:bold">5    50 25389 86395</span></pre>
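<p>If the table.1*c(1:3) result above looks odd, it may help to lay out the reused 1, 2, 3 in the same shape as the table. This is only a sketch: I rebuild table.1 from the values printed above instead of calling sample() again (newer versions of R may draw different numbers for the same seed), and multiplier is a name I made up.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">table.1 = data.frame(var.1 = 1:5,
                     var.2 = c(87, 236, 122, 263, 279),
                     var.3 = c(144, 394, 204, 439, 467))
# repeat 1, 2, 3 until all 15 cells are covered; a matrix fills column by column
multiplier = matrix(rep(1:3, length.out = 15), nrow = 5)
multiplier                        # the value each cell gets multiplied by
as.matrix(table.1) * multiplier   # matches the table.1*c(1:3) output above</pre>
<p>Reading down the second column of multiplier gives 3, 1, 2, 3, 1, which is why 87 became 261 while 236 stayed the same.</p>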
<p>Creating a matrix by multiplying vectors is also very easy.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> aa <span style="color:#ff0080; font-weight:bold">=</span> var.1 <span style="color:#ff0080; font-weight:bold">%</span>o<span style="color:#ff0080; font-weight:bold">%</span> var.4<span style="color:#ff0080; font-weight:bold">;</span> <span style="color:#f27900"># create a matrix multiplying var.1 by var.4</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">row.names</span><span style="color:#ff0080; font-weight:bold">(</span>aa<span style="color:#ff0080; font-weight:bold">) =</span> var.1<span style="color:#ff0080; font-weight:bold">;</span> <span style="color:#0080c0">colnames</span><span style="color:#ff0080; font-weight:bold">(</span>aa<span style="color:#ff0080; font-weight:bold">) =</span> var.4 <span style="color:#f27900"># set your row and column names</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> aa <span style="color:#f27900"># view your output. It's a nice multiplication table!</span>
   <span style="color:#800080; font-weight:bold">6  7  8  9 10</span>
<span style="color:#800080; font-weight:bold">1  6  7  8  9 10</span>
<span style="color:#800080; font-weight:bold">2 12 14 16 18 20</span>
<span style="color:#800080; font-weight:bold">3 18 21 24 27 30</span>
<span style="color:#800080; font-weight:bold">4 24 28 32 36 40</span>
<span style="color:#800080; font-weight:bold">5 30 35 40 45 50</span></pre>
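<p>The %o% operator is shorthand for the outer() function, and outer() can use operations other than multiplication. A short sketch, where times.table and plus.table are names I made up:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">var.1 = 1:5; var.4 = 6:10
times.table = outer(var.1, var.4)            # same as var.1 %o% var.4
dimnames(times.table) = list(var.1, var.4)   # row and column names in one step
plus.table = outer(var.1, var.4, "+")        # an addition table instead
times.table["3", "8"]                        # look up 3 times 8: 24</pre>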
]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-06-30/r-is-like-a-giant-calculator-for-grownups/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A little spark for presenting your data</title>
		<link>http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/</link>
		<comments>http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/#comments</comments>
		<pubDate>Thu, 17 Jun 2010 16:18:16 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[OpenOffice.org]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[Google Charts]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[sparklines]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=722</guid>
		<description><![CDATA[For some reason, I&#8217;ve been obsessing over the presentation of data. (Either it is that I&#8217;ve just read all of Edward Tufte&#8217;s books, or I&#8217;m just being a nerd. But I guess that those two things aren&#8217;t exactly exclusive&#8230;.) Considering my obsession, you could imagine how I felt when one of my students stood up [...]
]]></description>
			<content:encoded><![CDATA[<p>For some reason, I&#8217;ve been obsessing over the presentation of data. (Either I&#8217;ve just read all of <a href="http://www.edwardtufte.com/tufte/" target="_blank">Edward Tufte</a>&#8217;s books, or I&#8217;m just being a nerd. But I guess that those two things aren&#8217;t exactly mutually exclusive&#8230;) Considering my obsession, you can imagine how I felt when one of my students stood up and made a presentation that included the following slides, along with the typical, &#8220;As you can see here, the production of rice has been decreasing. And as you can see in this chart, the production of wheat has been decreasing,&#8221; for slide after slide after slide.</p>
<p><center><iframe src="http://docs.google.com/viewer?url=http%3A%2F%2Fdl.dropbox.com%2Fu%2F2556524%2FRaj%2527s%2520Data%2520%2528separate%2520slides%2529.pdf&#038;embedded=true" width="500" height="400" style="border: none;"></iframe></center><br />
<em>If for some reason you&#8217;re not able to see the embedded slides, you can also <a href="http://docs.google.com/viewer?url=http://dl.dropbox.com/u/2556524/Raj%2527s%2520Data%2520%2528separate%2520slides%2529.pdf" target="_blank">view the slides in a new window</a>.</em></p>
<p><span id="more-722"></span></p>
<p>For me, there are several problems with this. First, I can&#8217;t really compare the first slide with, say, the eighth, because I&#8217;m not given enough time to do so. Second, if the main point is to just talk about &#8220;increasing&#8221; and &#8220;decreasing&#8221;, are this many slides necessary? Third, the axes on the charts aren&#8217;t the same, making comparisons more difficult. Oh, and <a href="http://news.mrdwab.com/wp-content/uploads/2010/06/Rajs-Data-slides.jpg" rel="lightbox[722]">printing out all of your slides as a handout</a> doesn&#8217;t help either.</p>
<p>For something like this, <a href="http://en.wikipedia.org/wiki/Sparkline" target="_blank">sparklines</a>&#8211;one of the many interesting ideas that Tufte suggests&#8211;might be a solution, and they fit in well with my advice to my students that they should prepare presentation &#8220;fact sheets&#8221; or something similar rather than prepare slide after slide in PowerPoint. So, I thought I should figure out what my options are for creating them (short of downloading an illegal copy of Microsoft Office 2010, which is supposed to have sparklines built into the charting options).</p>
<p>It turns out that there are several options for making sparklines, whether you are using <a href="http://www.multiracio.com/eurooffice/products/eurooffice-sparkline" target="_blank">OpenOffice.org</a> or <a href="http://sparklines-excel.blogspot.com/" target="_blank">Microsoft Office</a>, or, for that matter, preparing data for presentation online. And, since it&#8217;s pretty easy to figure out the offline options, I thought I would try out the Google Chart application programming interface (API) to see what I could do with it.</p>
<p>The <a href="http://code.google.com/apis/visualization/documentation/gallery/imagesparkline.html" target="_blank">construct</a> is pretty basic. Each &#8220;column&#8221; of data is declared with a call like <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">data.addColumn(&quot;number&quot;, &quot;Revenue&quot;);</span> and each value is set with a call like <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">data.setValue(0,0,435);</span> where the first number is the row (the position on the x-axis for the item you&#8217;re charting), the second number is the column (the variable you&#8217;re charting, since you might have several), and the third number is the value of the variable at that position. Both indexes start at 0.</p>
<p>Here&#8217;s the problem, though. The format that my data is in looks like this:</p>
<p><center><iframe width='500' height='300' frameborder='0' src='https://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdFhtdmZtLVBfbjFWeS16bTJNZm5ZTHc&#038;hl=en&#038;single=true&#038;gid=0&#038;output=html&#038;widget=true'></iframe></center></p>
<p>To present my data using the Google Chart API would require a lot of annoying cutting and pasting.</p>
<p>Or would it?</p>
<p>Of course we could just make our lives easier by using <a href="http://www.r-project.org" target="_blank">R</a> to prepare our data, and here&#8217;s how.</p>
<p>First, load the data (using &#8220;read.csv&#8221;, which creates a data frame) and create a new object in R with the column names (we&#8217;re lazy, right, and we don&#8217;t want to type any more than we have to). Then convert the data frame into a matrix, and convert the matrix into a plain vector of numeric values. If this sounds complicated, it&#8217;s not. It&#8217;s just the following few lines of code:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> crop.prod <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.csv</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;http://news.mrdwab.com/cropproduction&quot;</span><span style="color:#ff0080; font-weight:bold">,</span>
<span style="color:#ff0080; font-weight:bold">+</span>                      header <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#8080c0; font-weight:bold">T</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">row.names</span> <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> crop.prod.names <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">names</span><span style="color:#ff0080; font-weight:bold">(</span>crop.prod<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># object with the column names</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> crop.prod.num <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">as.numeric</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">as.matrix</span><span style="color:#ff0080; font-weight:bold">(</span>crop.prod<span style="color:#ff0080; font-weight:bold">)))</span> <span style="color:#f27900"># data values</span>
</pre>
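<p>To see what that last line does, here is a small sketch using a made-up table of three years and two crops (the name &#8220;toy&#8221; and all of its values are invented for illustration and are not taken from the crop data). Converting with &#8220;as.matrix&#8221; and then &#8220;as.numeric&#8221; flattens the table one column at a time, which is the order that the later steps rely on.</p>

```r
# Hypothetical stand-in for crop.prod: 3 years (rows) x 2 crops (columns)
toy = data.frame(Rice = c(10, 20, 30), Wheat = c(40, 50, 60),
                 row.names = c("2001", "2002", "2003"))

toy.names = names(toy)               # object with the column names
toy.num = as.numeric(as.matrix(toy)) # values, flattened column by column

print(toy.names)
print(toy.num) # all of the Rice values come first, then all of the Wheat values
```

<p>Because the values come out crop by crop, the column index only has to change once every full run of rows.</p>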
<p>Next, we want to set things up so that we can &#8220;paste&#8221; our data together in a form that the Google Chart API can process.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> prefix.column <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#a68500">&quot;data.addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">&quot;</span> <span style="color:#f27900"># ugly, I know</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> prefix.value <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#a68500">&quot;data.setValue(&quot;</span> <span style="color:#f27900"># but we'll clean it up later</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> end.column <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#a68500">&quot;</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span> <span style="color:#f27900"># to be pasted at the end of each line of column names</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> end.value <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#a68500">&quot;);&quot;</span> <span style="color:#f27900"># to be pasted at the end of each line of data</span>
</pre>
<p>R has a great function, &#8220;paste&#8221;, for pasting things together. We&#8217;ll use that function to get our data into a nicer format. We&#8217;ll still have to clean it up a little bit (mostly removing extra commas and quotation marks), but that&#8217;s a simple search-and-replace procedure.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#f27900"># pasting together the column names</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> column.names <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">paste</span><span style="color:#ff0080; font-weight:bold">(</span>prefix.column<span style="color:#ff0080; font-weight:bold">,</span>crop.prod.names<span style="color:#ff0080; font-weight:bold">,</span>end.column<span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#f27900"># pasting together each measurement of data</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> crop.prod.data <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">paste</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">0</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">10</span><span style="color:#ff0080; font-weight:bold">),</span><span style="color:#0080c0">sort</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">rep</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">0</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">7</span><span style="color:#ff0080; font-weight:bold">),</span><span style="color:#800080; font-weight:bold">11</span><span style="color:#ff0080; font-weight:bold">)),</span>crop.prod.num<span style="color:#ff0080; font-weight:bold">,</span> sep<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;,&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> crop.prod.data <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">paste</span><span style="color:#ff0080; font-weight:bold">(</span>prefix.value<span style="color:#ff0080; font-weight:bold">,</span>crop.prod.data<span style="color:#ff0080; font-weight:bold">,</span>end.value<span style="color:#ff0080; font-weight:bold">)</span>
</pre>
<p>The most complicated line above is the <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">crop.prod.data <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">paste</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">0</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">10</span><span style="color:#ff0080; font-weight:bold">),</span><span style="color:#0080c0">sort</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">rep</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">0</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">7</span><span style="color:#ff0080; font-weight:bold">),</span><span style="color:#800080; font-weight:bold">11</span><span style="color:#ff0080; font-weight:bold">)),</span>crop.prod.num<span style="color:#ff0080; font-weight:bold">,</span> sep<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;,&quot;</span><span style="color:#ff0080; font-weight:bold">)</span></span> line, but even that is not too difficult to follow. &#8220;crop.prod.data&#8221; is the name of our object. Each element of that object is made up of three values separated by commas. The first value cycles through the digits 0 to 10 (eleven values overall), which R repeats for as long as required. The second value is eleven 0s, then eleven 1s, eleven 2s, and so on up to 7. The third value comes from the &#8220;crop.prod.num&#8221; vector we created earlier.</p>
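<p>The 0 to 10 and 0 to 7 in that line are typed in by hand to match this particular table of eleven rows and eight columns. As a rough sketch of one alternative (the &#8220;toy&#8221; matrix and every object name below are made up for illustration), the same two index sequences can be built from the dimensions of the data, and &#8220;sprintf&#8221; can fill in each line directly:</p>

```r
# Hypothetical toy data: 3 rows (positions) x 2 columns (variables)
toy = matrix(c(10, 20, 30, 40, 50, 60), nrow = 3)

n.rows = nrow(toy)
n.cols = ncol(toy)

row.index = rep(0:(n.rows - 1), times = n.cols) # 0 1 2 0 1 2
col.index = rep(0:(n.cols - 1), each = n.rows)  # 0 0 0 1 1 1

# one line per value, in the same column-by-column order as as.numeric(toy)
toy.data = sprintf("data.setValue(%d, %d, %s);",
                   row.index, col.index, as.numeric(toy))
print(toy.data)
```

<p>Here &#8220;rep&#8221; with &#8220;each&#8221; does the same job as sorting the repeated sequence, and nothing needs to change if a crop or a year is added to the table.</p>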
<p>At this point, we&#8217;re pretty much done. We just need to clean things up, and replace the contents of Google&#8217;s example page with our own data. Using <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#0080c0">fix</span><span style="color:#ff0080; font-weight:bold">(</span>column.names<span style="color:#ff0080; font-weight:bold">)</span></span> and <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#0080c0">fix</span><span style="color:#ff0080; font-weight:bold">(</span>crop.prod.data<span style="color:#ff0080; font-weight:bold">)</span></span> gives us the following output:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#f27900"># output from &quot;fix(column.names)&quot;</span>
<span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;data.addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Rice</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.addColumn(</span>
<span style="color:#a68500"></span>  <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Wheat</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Pulses</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">)</span>
<span style="color:#a68500">  ;&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Cereals</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.addColumn(</span>
<span style="color:#a68500"></span>  <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Food.Grains</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Oil.</span>
<span style="color:#a68500">  Seeds</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Cotton</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.</span>
<span style="color:#a68500">  addColumn(</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">number</span><span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">,</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span> <span style="color:#a68500">Sugarcane</span> <span style="color:#ff00ff; font-weight:bold">\&quot;</span><span style="color:#a68500">);&quot;</span> <span style="color:#ff0080; font-weight:bold">)</span>

<span style="color:#f27900"># extracted output from &quot;fix(crop.prod.data)&quot; </span>
<span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;data.setValue( 0,0,4500 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.setValue( 1,0,5000 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.</span>
<span style="color:#a68500">  setValue( 2,0,6500 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.setValue( 3,0,1000 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.setValue(</span>
<span style="color:#a68500">  4,0,1750 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.setValue( 5,0,1000 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;data.setValue( 6,0,1750 </span>
<span style="color:#a68500">  );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> ... ... ... <span style="color:#a68500">&quot;data.setValue( 2,1,8600 );&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> ... ... ...</pre>
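<p>The backslashes and the outer quotation marks in that output are only how R displays the strings, and the spaces around each name come from the default separator that &#8220;paste&#8221; uses. As a small sketch (with made-up names standing in for the real column names), setting the separator to an empty string and printing with &#8220;writeLines&#8221; gives lines that need no further editing:</p>

```r
# Hypothetical column names standing in for crop.prod.names
toy.names = c("Rice", "Wheat")

# sep = "" stops paste from padding each name with spaces
toy.columns = paste("data.addColumn(\"number\", \"", toy.names, "\");", sep = "")

# writeLines prints the strings themselves, one per line, without the escapes
writeLines(toy.columns)
```

<p>The printed lines can then be copied straight from the R console into the page.</p>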
<p>Using any decent text editor (like Notepad++ or Komodo Edit [which I use]) makes getting rid of your unnecessary slashes and quotation marks a two-second job, and then you are ready to do one last bit of copying and pasting, using the HTML <a href="http://code.google.com/apis/visualization/documentation/gallery/imagesparkline.html" target="_blank">on this page</a> as a guide. Here&#8217;s what my final HTML looked like:</p>
<pre class="brush: xml;">
&lt;html&gt;
  &lt;head&gt;
    &lt;script type=&quot;text/javascript&quot; src=&quot;http://www.google.com/jsapi&quot;&gt;&lt;/script&gt;
    &lt;script type=&quot;text/javascript&quot;&gt;
    google.load(&quot;visualization&quot;, &quot;1&quot;, {packages:[&quot;imagesparkline&quot;]});
    google.setOnLoadCallback(drawChart);
    function drawChart() {
    var data = new google.visualization.DataTable();
      data.addColumn(&quot;number&quot;, &quot; Rice &quot;);
	  data.addColumn(&quot;number&quot;, &quot; Wheat &quot;);
      data.addColumn(&quot;number&quot;, &quot; Pulses &quot;);
	  data.addColumn(&quot;number&quot;, &quot; Cereals &quot;);
      data.addColumn(&quot;number&quot;, &quot; Food Grains &quot;);
	  data.addColumn(&quot;number&quot;, &quot; Oil Seeds &quot;);
      data.addColumn(&quot;number&quot;, &quot; Cotton &quot;);
	  data.addColumn(&quot;number&quot;, &quot; Sugarcane &quot;);
    data.addRows(11);
      data.setValue( 0,0,4500 ); data.setValue( 1,0,5000 );
	  data.setValue( 2,0,6500 ); data.setValue( 3,0,1000 );
	  data.setValue( 4,0,1750 ); data.setValue( 5,0,1000 );
	  data.setValue( 6,0,1750 ); data.setValue( 7,0,1100 );
	  data.setValue( 8,0,1600 ); data.setValue( 9,0,1400 );
	  data.setValue( 10,0,1500 ); data.setValue( 0,1,7200 );
	  data.setValue( 1,1,8300 ); data.setValue( 2,1,8600 );
	  data.setValue( 3,1,4900 ); data.setValue( 4,1,5000 );
	  data.setValue( 5,1,4000 ); data.setValue( 6,1,7400 );
	  data.setValue( 7,1,7200 ); data.setValue( 8,1,6000 );
	  data.setValue( 9,1,7300 ); data.setValue( 10,1,6000 );
	  data.setValue( 0,2,3250 ); data.setValue( 1,2,3600 );
	  data.setValue( 2,2,3750 ); data.setValue( 3,2,2250 );
	  data.setValue( 4,2,3250 ); data.setValue( 5,2,2400 );
	  data.setValue( 6,2,3500 ); data.setValue( 7,2,3400 );
	  data.setValue( 8,2,3250 ); data.setValue( 9,2,3250 );
	  data.setValue( 10,2,2500 ); data.setValue( 0,3,2300 );
	  data.setValue( 1,3,2500 ); data.setValue( 2,3,2400 );
      data.setValue( 3,3,2100 ); data.setValue( 4,3,2700 );
	  data.setValue( 5,3,2400 ); data.setValue( 6,3,3400 );
	  data.setValue( 7,3,2300 ); data.setValue( 8,3,2300 );
      data.setValue( 9,3,1800 ); data.setValue( 10,3,1200 );
	  data.setValue( 0,4,17500 ); data.setValue( 1,4,19000 );
	  data.setValue( 2,4,22000 ); data.setValue( 3,4,10000 );
      data.setValue( 4,4,14000 ); data.setValue( 5,4,11000 );
	  data.setValue( 6,4,16500 ); data.setValue( 7,4,14000 );
	  data.setValue( 8,4,13000 ); data.setValue( 9,4,14000 );
      data.setValue( 10,4,12500 ); data.setValue( 0,5,5700 );
	  data.setValue( 1,5,5700 ); data.setValue( 2,5,5900 );
	  data.setValue( 3,5,4100 ); data.setValue( 4,5,4500 );
	  data.setValue( 5,5,3100 ); data.setValue( 6,5,5500 );
	  data.setValue( 7,5,4800 ); data.setValue( 8,5,5800 );
	  data.setValue( 9,5,5900 ); data.setValue( 10,5,6400 );
	  data.setValue( 0,6,510 ); data.setValue( 1,6,430 );
	  data.setValue( 2,6,420 ); data.setValue( 3,6,250 );
	  data.setValue( 4,6,400 ); data.setValue( 5,6,390 );
	  data.setValue( 6,6,650 ); data.setValue( 7,6,640 );
	  data.setValue( 8,6,750 ); data.setValue( 9,6,840 );
	  data.setValue( 10,6,870 ); data.setValue( 0,7,1650 );
	  data.setValue( 1,7,1600 ); data.setValue( 2,7,2000 );
	  data.setValue( 3,7,1650 ); data.setValue( 4,7,1600 );
	  data.setValue( 5,7,1550 ); data.setValue( 6,7,1750 );
	  data.setValue( 7,7,2000 ); data.setValue( 8,7,2500 );
	  data.setValue( 9,7,2750 ); data.setValue( 10,7,3250 );
    var chart = new google.visualization.ImageSparkLine(document.getElementById('chart_div'));
    chart.draw(data, {width: 170, height: 40, color: '#545454',
	showAxisLines: false, showValueLabels: false, labelPosition: 'left'});
    }
    &lt;/script&gt;
  &lt;/head&gt;

  &lt;body&gt;
    &lt;div id=&quot;chart_div&quot;&gt;&lt;/div&gt;
  &lt;/body&gt;
&lt;/html&gt;
</pre>
<p>The output (seen below) can easily be copied into a MS Word document and more information can be added to it as necessary.<br />
<center><iframe width='190' height='180' frameborder='0' src='http://db.tt/L2xanW'></iframe></center></p>
<p>By the way, here is <a href="http://news.mrdwab.com/wp-content/uploads/2010/06/sparklines-table.pdf">a PDF of a document created in OpenOffice.org</a> demonstrating what this might look like in a table (made using the EuroOffice sparkline plugin) as well as what a stacked line graph would look like. Both of these would make much better handouts during a presentation than the printout of slides shown earlier.</p>


]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Font experiments, Part 2</title>
		<link>http://news.mrdwab.com/2010-05-20/font-experiments-part-2/</link>
		<comments>http://news.mrdwab.com/2010-05-20/font-experiments-part-2/#comments</comments>
		<pubDate>Thu, 20 May 2010 11:25:57 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[experiments]]></category>
		<category><![CDATA[fonts]]></category>
		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=678</guid>
		<description><![CDATA[My initial excitement about the Google Font Directory is a little bit diminished right now. According to the Google Code Blog: The Google Font API hides a lot of complexity behind the scenes. Google’s serving infrastructure takes care of converting the font into a format compatible with any modern browser (including Internet Explorer 6 and [...]


]]></description>
			<content:encoded><![CDATA[<p>My initial excitement about the Google Font Directory is a little bit diminished right now.</p>
<p><span id="more-678"></span></p>
<p>According to the <a href="http://googlecode.blogspot.com/2010/05/introducing-google-font-api-google-font.html">Google Code Blog</a>:</p>
<blockquote><p>The Google Font API hides a lot of complexity behind the scenes. Google’s serving infrastructure takes care of converting the font into a format compatible with any modern browser (including Internet Explorer 6 and up), sends just the styles and weights you select, and the font files and CSS are tuned and optimized for web serving.</p></blockquote>
<p>But here are a few screencaps of my site in different browsers:</p>

<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/chrome/' title='Chrome: Looks pretty much as I would expect it to, but not sure why there seems to be so much aliasing.'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/chrome-150x150.jpg" class="attachment-thumbnail" alt="Chrome" title="Chrome: Looks pretty much as I would expect it to, but not sure why there seems to be so much aliasing." /></a>
<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/opera/' title='Opera: Also, as expected, but similar concern about aliasing.'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/opera-150x150.jpg" class="attachment-thumbnail" alt="Opera" title="Opera: Also, as expected, but similar concern about aliasing." /></a>
<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/safari/' title='Safari (on Windows): Uses the right fonts, but they look overly &quot;heavy&quot; to me.'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/safari-150x150.jpg" class="attachment-thumbnail" alt="Safari (on Windows)" title="Safari (on Windows): Uses the right fonts, but they look overly &quot;heavy&quot; to me." /></a>
<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/ie7/' title='IE7: Headings are correct, but not the main text.'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/ie7-150x150.jpg" class="attachment-thumbnail" alt="IE7" title="IE7: Headings are correct, but not the main text." /></a>
<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/firefox/' title='Firefox: The big surprise for me! Nothing works!'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/firefox-150x150.jpg" class="attachment-thumbnail" alt="Firefox" title="Firefox: The big surprise for me! Nothing works!" /></a>
<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/specimen1/' title='The specimen page for IM Fell English SC on Chrome. Looks like a fun font.'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/specimen1-150x150.jpg" class="attachment-thumbnail" alt="Specimen page, Chrome" title="The specimen page for IM Fell English SC on Chrome. Looks like a fun font." /></a>
<a href='http://news.mrdwab.com/2010-05-20/font-experiments-part-2/specimen2/' title='The specimen page for IM Fell English SC on Firefox. Looks like a pretty regular font.'><img width="150" height="150" src="http://news.mrdwab.com/wp-content/uploads/2010/05/specimen2-150x150.jpg" class="attachment-thumbnail" alt="Specimen page, Firefox" title="The specimen page for IM Fell English SC on Firefox. Looks like a pretty regular font." /></a>

<p>I guess this has inspired me to explore using the @font-face option though, so there might be some more font experiments on the way.</p>


]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-05-20/font-experiments-part-2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Font experiments</title>
		<link>http://news.mrdwab.com/2010-05-20/font-experiments/</link>
		<comments>http://news.mrdwab.com/2010-05-20/font-experiments/#comments</comments>
		<pubDate>Thu, 20 May 2010 05:55:57 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Website]]></category>
		<category><![CDATA[experiments]]></category>
		<category><![CDATA[fonts]]></category>
		<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=676</guid>
		<description><![CDATA[Google just launched the Google Font Directory (beta, of course) and the Google Font API which provides web-designers with an easy way to extend the font options that tend to limit many websites. Using the service is pretty simple: go to the font directory, find the font you want to use, and follow the simple [...]


]]></description>
			<content:encoded><![CDATA[<p>Google just launched the <a href="http://code.google.com/webfonts" target="_blank">Google Font Directory</a> (beta, of course) and the Google Font API, which give web designers an easy way to get past the limited font options that constrain many websites. Using the service is pretty simple: go to the font directory, find the font you want to use, and follow the simple instructions under the &#8220;Get the code&#8221; tab.</p>
<p>The font list seems a bit limited at the moment, but to experiment with the feature, I&#8217;ve changed all of the post headings at this site to be displayed in <a href="http://code.google.com/webfonts/family?family=IM+Fell+English+SC" target="_blank">IM Fell English SC</a> and the body text to <a href="http://code.google.com/webfonts/family?family=Molengo" target="_blank">Molengo</a>. It would be great to see fonts like <a href="http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&#038;id=Gentium" target="_blank">Gentium</a> and <a href="http://www.linuxlibertine.org/index.php?id=1&#038;L=1" target="_blank">Linux Libertine</a> on there too.</p>


]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-05-20/font-experiments/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>It&#8217;s a choropleth party with R, and everyone&#8217;s invited</title>
		<link>http://news.mrdwab.com/2010-05-16/choropleth-party-with-r/</link>
		<comments>http://news.mrdwab.com/2010-05-16/choropleth-party-with-r/#comments</comments>
		<pubDate>Sun, 16 May 2010 17:10:00 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[India]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[maps]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=663</guid>
		<description><![CDATA[Map party time. For some reason this happens every once in a while with me. A few years ago, I got to develop a website filled with choropleth maps galore. It was a pretty tedious process. Excel sheets. Photoshop. No good access to free Indian shapefiles. I was even thinking of making my own SVG [...]


Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2010-07-11/getting-data-into-r/' rel='bookmark' title='Permanent Link: Getting data into R'>Getting data into R</a> <small>When you first open R, you&#8217;re greeted with a screen...</small></li>
<li><a href='http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/' rel='bookmark' title='Permanent Link: A little spark for presenting your data'>A little spark for presenting your data</a> <small>For some reason, I&#8217;ve been obsessing over the presentation of...</small></li>
<li><a href='http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/' rel='bookmark' title='Permanent Link: Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R'>Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R</a> <small>A lot of the times, students at the Academy enter...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><img src="http://news.mrdwab.com/wp-content/uploads/2010/05/tn-pop-density-map-150x150.jpg" alt="Tamil Nadu Population Density" title="tn-pop-density-map" width="150" height="150" class="alignright size-thumbnail wp-image-669" /></p>
<p>Map party time. For some reason this happens every once in a while with me. A few years ago, I got to develop a website filled with <a href="http://en.wikipedia.org/wiki/Choropleth_map" target="_blank">choropleth maps</a> galore. It was a pretty tedious process. Excel sheets. Photoshop. No good access to free Indian shapefiles. I was even thinking of making my own SVG files of Indian states <a href="http://news.mrdwab.com/2006-04-20/march-was-a-slow-month/">at one point</a> and thinking of a complex PHP and MySQL website.</p>
<p>Skip forward a few years now, and I&#8217;m back with the maps. Only this time, I have some new tools and resources: the software named after a <a href="http://www.r-project.org" target="_blank">pirate&#8217;s favorite letter</a>, some free maps from the <a href="http://gadm.org/" target="_blank">Global Administrative Areas</a> website, some data from <a href="http://census2001.tn.nic.in/pca2001.aspx" target="_blank">the 2001 Indian Census</a> (I selected district data, all districts, and total population), and Google Docs (to clean up my CSV files).</p>
<p><span id="more-663"></span></p>
<p>Enough history. Let&#8217;s get started mapping, and since I&#8217;m in Tamil Nadu, I&#8217;m going to restrict myself to that state.</p>
<p>First, download the <a href="http://gadm.org/data/rda/IND_adm2.RData">district level RData file</a> from the GADM website (it&#8217;s a little under 7MB). Double-click on the file to open the R workspace. By using <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">ls</span><span style="color:#ff0080; font-weight:bold">()</span> </span> we can see that the only object in this workspace is named &#8220;gadm&#8221;. At this point, you can&#8217;t actually do much. Typing <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">fix</span><span style="color:#ff0080; font-weight:bold">(</span>gadm<span style="color:#ff0080; font-weight:bold">)</span></span> just pops up an editor window that says <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><br />
<span style="color:#ff0080; font-weight:bold">&lt;</span>S4 object of <span style="color:#0080c0">class structure</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;SpatialPolygonsDataFrame&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> package <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#a68500">&quot;sp&quot;</span><span style="color:#ff0080; font-weight:bold">)&gt;</span></span> which isn&#8217;t really useful. We need to load the libraries we&#8217;ll need:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">
&gt;</span> <span style="color:#0080c0">library</span><span style="color:#ff0080; font-weight:bold">(</span>sp<span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">library</span><span style="color:#ff0080; font-weight:bold">(</span>RColorBrewer<span style="color:#ff0080; font-weight:bold">)</span>
</pre>
<p>Since I want to see what is in the gadm object, I can just write a CSV of the object by typing <span style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">write.csv</span><span style="color:#ff0080; font-weight:bold">(</span>gadm<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;gadm-data.csv&quot;</span><span style="color:#ff0080; font-weight:bold">)</span></span>. This will get us the following file:</p>
<p><center><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdGlvd3p3UVdwZFl0ZkdjZlRLallmS2c&#038;hl=en&#038;single=true&#038;gid=0&#038;output=html&#038;widget=true'></iframe></center></p>
<p>As you can see, this spreadsheet covers all the districts in India, but I&#8217;m only interested in Tamil Nadu. Furthermore, unless the data are sorted first, the Tamil Nadu districts don&#8217;t all appear together. To fix this, I can simply create a new object for Tamil Nadu. For this, I only want the rows where the value in the &#8220;NAME_1&#8221; column is &#8220;Tamil Nadu&#8221;. It will also be easier to arrange my census data in the right order if I have a CSV of this new object with the district names (which are under the variable &#8220;NAME_2&#8221;) in alphabetical order.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">
&gt;</span> Tamil.Nadu <span style="color:#ff0080; font-weight:bold">=</span> gadm<span style="color:#ff0080; font-weight:bold">[</span>gadm$NAME_1<span style="color:#ff0080; font-weight:bold">==</span><span style="color:#a68500">&quot;Tamil Nadu&quot;</span><span style="color:#ff0080; font-weight:bold">,]</span> <span style="color:#f27900"># Select only Tamil Nadu</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> Tamil.Nadu <span style="color:#ff0080; font-weight:bold">=</span> Tamil.Nadu<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#0080c0">order</span><span style="color:#ff0080; font-weight:bold">(</span>Tamil.Nadu$NAME_2<span style="color:#ff0080; font-weight:bold">),]</span> <span style="color:#f27900"># Order by district</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">write.csv</span><span style="color:#ff0080; font-weight:bold">(</span>Tamil.Nadu<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;TN_db_raw.csv&quot;</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># Write output as CSV</span></pre>
<p>Fortunately, the names for the districts in the GADM dataset mostly match with those from the Indian census, so matching the data is quite straightforward. I&#8217;ve gone ahead and <a href="http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdEJjVHhpalBfVng1dW9oMTI0RXp0TVE&#038;hl=en&#038;single=true&#038;gid=0&#038;output=html" target="_blank">uploaded the combined information</a> to Google Docs.</p>
<p><center><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdEJjVHhpalBfVng1dW9oMTI0RXp0TVE&#038;hl=en&#038;single=true&#038;gid=0&#038;output=html&#038;widget=true'></iframe></center></p>
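<p>Since the district names only &#8220;mostly&#8221; match, it can help to let R point out the mismatches and line the rows up before combining anything. Here&#8217;s a minimal sketch of that check. The names <code>map.names</code> and <code>census</code> are made up for this illustration (they stand in for <code>Tamil.Nadu$NAME_2</code> and a census table), and the population figures and the spelling mismatch are invented.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">
# Hypothetical stand-ins: map.names plays the role of Tamil.Nadu$NAME_2,
# census plays the role of the table typed in from the census website.
map.names = c("Ariyalur", "Chennai", "Coimbatore", "Nilgiris")
census = data.frame(NAME_2 = c("Coimbatore", "Chennai", "The Nilgiris", "Ariyalur"),
                    Total_Pop = c(400, 300, 200, 100),
                    stringsAsFactors = FALSE)

# Names on the map that have no exact counterpart in the census table
unmatched = setdiff(map.names, census$NAME_2)
unmatched

# Fix the spelling by hand, then reorder the census rows to follow the map
census$NAME_2[census$NAME_2 == "The Nilgiris"] = "Nilgiris"
census = census[match(map.names, census$NAME_2), ]
census
</pre>
<p>The <code>match()</code> step matters because a column that is later copied onto the map object is lined up by row position, not by district name.</p>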
<p>Let&#8217;s load this new file.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">
&gt;</span> TN.DB <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.csv</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;http://news.mrdwab.com/tnpop&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> header<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#8080c0; font-weight:bold">T</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">names</span><span style="color:#ff0080; font-weight:bold">(</span>TN.DB<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># check to get variable names</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;NAME_2&quot;</span>     <span style="color:#a68500">&quot;Total_HH&quot;</span>   <span style="color:#a68500">&quot;Total_Pop&quot;</span>  <span style="color:#a68500">&quot;Male_Pop&quot;</span>   <span style="color:#a68500">&quot;Female_Pop&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">6</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;Sex_Ratio&quot;</span>  <span style="color:#a68500">&quot;Area&quot;</span>       <span style="color:#a68500">&quot;Pop_Dens&quot;</span></pre>
<p>And now, let&#8217;s create a map. While it might be tempting to create a map of, say, total population, that&#8217;s not really how choropleth maps should be used: raw counts mostly reflect how big a district is. From the data here, without doing any further calculations, the only variable that makes sense to map is population density. I&#8217;ll start by doing a quick summary and plot to see what the data look like:</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#f27900">
# Use &quot;digits=&quot; to make sure that R doesn't round our results</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">summary</span><span style="color:#ff0080; font-weight:bold">(</span>TN.DB$Pop_Dens<span style="color:#ff0080; font-weight:bold">,</span> digits<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#800080; font-weight:bold">6</span><span style="color:#ff0080; font-weight:bold">)</span>
     Min.   <span style="color:#800080; font-weight:bold">1</span>st Qu.    Median      Mean   <span style="color:#800080; font-weight:bold">3</span>rd Qu.      Max.
  <span style="color:#800080; font-weight:bold">278.870   343.480   416.850  1291.610   605.765 24963.500</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">plot</span><span style="color:#ff0080; font-weight:bold">(</span>TN.DB$Pop_Dens<span style="color:#ff0080; font-weight:bold">)</span></pre>
<div id="attachment_670" class="wp-caption aligncenter" style="width: 410px"><img src="http://news.mrdwab.com/wp-content/uploads/2010/05/tn-pop-density-400x395.jpg" alt="Plot of population density" title="tn-pop-density" width="400" height="395" class="size-medium wp-image-670" /><p class="wp-caption-text">Notice the outlier. This will definitely skew our bins if we don't deal with it.</p></div>
<p>From our plot, I can see that there is an outlier in the data. This is useful to know: if I simply went ahead and created our &#8220;bins&#8221; with that data point included, I wouldn&#8217;t capture the variation among the lower values, since they would most likely all be put into one bin.</p>
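<p>If you&#8217;d rather not rely on eyeballing the plot, R can also name the outlier for you. This is only a sketch on made-up numbers: the data frame <code>toy</code> and its district names are hypothetical stand-ins for TN.DB, and <code>boxplot.stats()</code> is just one common rule of thumb for flagging extreme values.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">
# Hypothetical districts and densities; the real values live in TN.DB
toy = data.frame(NAME_2 = c("A", "B", "C", "D", "E"),
                 Pop_Dens = c(279, 343, 417, 606, 24963),
                 stringsAsFactors = FALSE)

# Which district has the largest density?
top.district = toy$NAME_2[which.max(toy$Pop_Dens)]
top.district

# Values that a boxplot would draw as separate outlier points
outliers = boxplot.stats(toy$Pop_Dens)$out
outliers
</pre>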
<p>In this map, I&#8217;m going to be lazy and simply divide the population density range into bins that are (more-or-less) equal sizes. You can also use quartiles if you prefer.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#f27900">
# Set your lower and upper limits, excluding the outlier, and length </span>
<span style="color:#f27900"># equal to one more than the number of bins you actually want.</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> TNPopDense <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">c</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">round</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">seq</span><span style="color:#ff0080; font-weight:bold">(</span>from<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#800080; font-weight:bold">0</span><span style="color:#ff0080; font-weight:bold">,</span> to<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#800080; font-weight:bold">1100</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">length</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#800080; font-weight:bold">8</span><span style="color:#ff0080; font-weight:bold">),</span> digits<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#800080; font-weight:bold">0</span><span style="color:#ff0080; font-weight:bold">),</span>
<span style="color:#ff0080; font-weight:bold">+</span>              <span style="color:#0080c0">round</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">max</span><span style="color:#ff0080; font-weight:bold">(</span>TN.DB$Pop_Dens<span style="color:#ff0080; font-weight:bold">)))</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> TNPopDense <span style="color:#f27900"># Preview our breaks</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span>     <span style="color:#800080; font-weight:bold">0   157   314   471   629   786   943  1100 24964</span>
<span style="color:#f27900"># Use our breaks to put each of the values in the &quot;Pop_Dens&quot; column</span>
<span style="color:#f27900"># into its corresponding bin.</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> PopDenseRange <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">as.factor</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">as.numeric</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#0080c0">cut</span><span style="color:#ff0080; font-weight:bold">(</span>TN.DB$Pop_Dens<span style="color:#ff0080; font-weight:bold">,</span> TNPopDense<span style="color:#ff0080; font-weight:bold">)))</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> PopDenseRange <span style="color:#f27900"># Preview which bins the districts fall into.</span>
 <span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#800080; font-weight:bold">3 8 4 4 2 3 3 5 7 3 5 4 3 3 2 2 2 4 2 5 3 5 4 3 4 3 3 4 3 3</span>
Levels<span style="color:#ff0080; font-weight:bold">:</span> <span style="color:#800080; font-weight:bold">2 3 4 5 7 8</span></pre>
<blockquote><p>Note that this may not have been the best way to do this&#8211;I wanted to put each district in one of eight levels, preferably using all levels, but they only fall into six levels with this particular set of data.</p></blockquote>
<p>Now, using the levels that we got in the previous step, we can work on our &#8220;legend&#8221; for the map. We can use &#8220;&lt;&nbsp;157&#8221;, &#8220;157 &#8211; 314&#8221;, and so on as our legend entries. After that, we add a column to our TN.DB object (and to the Tamil.Nadu map object) to store these new values.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';"><span style="color:#ff0080; font-weight:bold">
&gt;</span> <span style="color:#0080c0">levels</span><span style="color:#ff0080; font-weight:bold">(</span>PopDenseRange<span style="color:#ff0080; font-weight:bold">) =</span> <span style="color:#0080c0">list</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;&lt; 157&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;1&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;157 - 314&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;2&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;315 - 471&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;3&quot;</span><span style="color:#ff0080; font-weight:bold">,</span>
<span style="color:#ff0080; font-weight:bold">+</span>                         <span style="color:#a68500">&quot;472 - 629&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;4&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;630 - 786&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;5&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;787 - 943&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;6&quot;</span><span style="color:#ff0080; font-weight:bold">,</span>
<span style="color:#ff0080; font-weight:bold">+</span>                         <span style="color:#a68500">&quot;944 - 1,100&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;7&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;&gt; 1,100&quot;</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;8&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> TN.DB$PopDenseRange <span style="color:#ff0080; font-weight:bold">=</span> PopDenseRange <span style="color:#f27900"># Merging info</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> shadePopDense <span style="color:#ff0080; font-weight:bold">=</span> brewer.pal<span style="color:#ff0080; font-weight:bold">(</span><span style="color:#800080; font-weight:bold">8</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;Blues&quot;</span><span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># Setting up the coloring scheme</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> Tamil.Nadu$PopDenseRange <span style="color:#ff0080; font-weight:bold">=</span> PopDenseRange
<span style="color:#f27900"># And now we plot it!</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> PopDensePlot <span style="color:#ff0080; font-weight:bold">=</span> spplot<span style="color:#ff0080; font-weight:bold">(</span>Tamil.Nadu<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#a68500">&quot;PopDenseRange&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">col</span><span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;blue&quot;</span><span style="color:#ff0080; font-weight:bold">,</span>
 <span style="color:#ff0080; font-weight:bold">+</span>               col.regions<span style="color:#ff0080; font-weight:bold">=</span>shadePopDense<span style="color:#ff0080; font-weight:bold">,</span>
 <span style="color:#ff0080; font-weight:bold">+</span>               main<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;Tamil Nadu population density by district&quot;</span><span style="color:#ff0080; font-weight:bold">)</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> PopDensePlot</pre>
<div id="attachment_669" class="wp-caption aligncenter" style="width: 410px"><a href="http://news.mrdwab.com/wp-content/uploads/2010/05/tn-pop-density-map.jpeg" rel="lightbox[663]"><img src="http://news.mrdwab.com/wp-content/uploads/2010/05/tn-pop-density-map-400x395.jpg" alt="Tamil Nadu Population Density" title="tn-pop-density-map" width="400" height="395" class="size-medium wp-image-669" /></a><p class="wp-caption-text">Here's our final map of population density.</p></div>
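<p>The binning and labelling steps above could probably be collapsed into a single call, since <code>cut()</code> accepts a <code>labels</code> argument and keeps every bin as a factor level even when no district falls into it. The sketch below uses made-up densities (<code>dens</code> is a hypothetical stand-in for TN.DB$Pop_Dens) and also shows the quartile alternative mentioned earlier. It&#8217;s an illustration of the idea, not the code used for the map above.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">
# Made-up densities standing in for TN.DB$Pop_Dens
dens = c(279, 343, 417, 606, 810, 1050, 24963)

# Same idea as in the post: equal-width breaks up to 1100,
# plus one open-ended bin that catches the outlier
breaks = c(round(seq(from = 0, to = 1100, length = 8)), round(max(dens)))
legend.labels = c("0 - 157", "158 - 314", "315 - 471", "472 - 629",
                  "630 - 786", "787 - 943", "944 - 1,100", "above 1,100")

# cut() returns a factor that keeps all eight levels, empty or not
bins = cut(dens, breaks = breaks, labels = legend.labels)
table(bins)

# Quartile alternative: let the data choose the breaks
q.breaks = quantile(dens, probs = seq(0, 1, by = 0.25))
q.bins = cut(dens, breaks = q.breaks, include.lowest = TRUE)
table(q.bins)
</pre>
<p>With quartiles, each bin holds roughly the same number of districts, so the outlier no longer needs special treatment, but the bin widths become very uneven.</p>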
<p>And that&#8217;s pretty much it! I&#8217;m sure there&#8217;s a lot of room for improvement in the code and the process, but overall, not too bad for just a few lines of syntax in R. [R also exports <a href="http://news.mrdwab.com/wp-content/uploads/2010/05/tn-pop-density-map.pdf">really nice PDF files</a>.]</p>
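<p>For anyone wondering about the PDF part, one way to get a PDF out of R is to open a <code>pdf()</code> graphics device, draw the plot, and close the device again. This is a generic sketch with a throwaway plot and a temporary file name that I made up for the example.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">
# Hypothetical output location; use any file name you like
out.file = file.path(tempdir(), "example-plot.pdf")

pdf(out.file, width = 7, height = 7)  # open a PDF graphics device
# For a lattice object such as the spplot() result, you would call
# print(PopDensePlot) here instead of plot()
plot(c(279, 343, 417, 606), main = "Example plot")
dev.off()                             # close the device so the file is written

file.exists(out.file)
</pre>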
<p>Aside from referring to the typical R documentation, <a href="http://ryouready.wordpress.com/2009/11/16/infomaps-using-r-visualizing-german-unemployment-rates-by-color-on-a-map/" target="_blank">this example</a> really helped me to figure out what I needed to do.</p>


<p>Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2010-07-11/getting-data-into-r/' rel='bookmark' title='Permanent Link: Getting data into R'>Getting data into R</a> <small>When you first open R, you&#8217;re greeted with a screen...</small></li>
<li><a href='http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/' rel='bookmark' title='Permanent Link: A little spark for presenting your data'>A little spark for presenting your data</a> <small>For some reason, I&#8217;ve been obsessing over the presentation of...</small></li>
<li><a href='http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/' rel='bookmark' title='Permanent Link: Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R'>Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R</a> <small>A lot of the times, students at the Academy enter...</small></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-05-16/choropleth-party-with-r/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Quickly reshaping data from &#8220;wide&#8221; to &#8220;long&#8221; formats in R</title>
		<link>http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/</link>
		<comments>http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/#comments</comments>
		<pubDate>Sun, 18 Apr 2010 08:57:45 +0000</pubDate>
		<dc:creator>Ananda</dc:creator>
				<category><![CDATA[(all categories)]]></category>
		<category><![CDATA[Geekiness]]></category>
		<category><![CDATA[Useless Knowledge]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[data manipulation]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[reshape]]></category>

		<guid isPermaLink="false">http://news.mrdwab.com/?p=638</guid>
		<description><![CDATA[A lot of the time, students at the Academy enter data in a &#8220;wide&#8221; format (since it is a very natural way to enter data in a spreadsheet). Let&#8217;s say, for example, that they were collecting data for a household, and for each person, they were collecting information on three variables. Assume also that they [...]


Related posts (possibly):<ol><li><a href='http://news.mrdwab.com/2010-07-11/getting-data-into-r/' rel='bookmark' title='Permanent Link: Getting data into R'>Getting data into R</a> <small>When you first open R, you&#8217;re greeted with a screen...</small></li>
<li><a href='http://news.mrdwab.com/2010-06-17/a-little-spark-for-presenting-your-data/' rel='bookmark' title='Permanent Link: A little spark for presenting your data'>A little spark for presenting your data</a> <small>For some reason, I&#8217;ve been obsessing over the presentation of...</small></li>
<li><a href='http://news.mrdwab.com/2010-08-08/using-the-reshape-packagein-r/' rel='bookmark' title='Permanent Link: Using the reshape package in R for pivot-table-like functionality'>Using the reshape package in R for pivot-table-like functionality</a> <small>A little more than a week ago, I wrote about...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>A lot of the time, students at the Academy enter data in a &#8220;wide&#8221; format (since it is a very natural way to enter data in a spreadsheet). Let&#8217;s say, for example, that they were collecting data for a household, and for each person, they were collecting information on three variables. Assume also that they were only collecting information about five household members. They might end up with a first row of column names something like &#8220;HouseholdID&#8221; | &#8220;member.01&#8221; | &#8220;member.02&#8221; | &#8220;member.03&#8221; | &#8220;member.04&#8221; | &#8220;member.05&#8221; | &#8220;variable1.01&#8221; | &#8220;variable1.02&#8221; | &#8220;variable1.03&#8221; | &#8220;variable1.04&#8221; | &#8220;variable1.05&#8221; | &#8220;variable2.01&#8221; | &#8220;variable2.02&#8221; &#8230; and so on. Sometimes, however, we may find it more useful to have our data in a &#8220;long&#8221; format. This post tells you how to quickly do that using <a href="http://www.r-project.org">R</a>.</p>
<p><span id="more-638"></span></p>
<p>Here&#8217;s an example spreadsheet with some nonsense data entered. </p>
<div align="center"><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdGphRzQzTnNlZHMxUHNkSExtQ3lwM0E&#038;hl=en&#038;single=true&#038;gid=0&#038;output=html&#038;widget=true'></iframe></div>
<p>The data is fine in this format, but it makes it difficult to, say, look at everyone who, for ItemC, has &#8220;red&#8221; as their choice. Below is an example of the same data in the spreadsheet above in long format, from which we can easily determine the answer to such a question.</p>
<div align="center"><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdGphRzQzTnNlZHMxUHNkSExtQ3lwM0E&#038;hl=en&#038;single=true&#038;gid=1&#038;output=html&#038;widget=true'></iframe></div>
<p>Now, you can get to those results with some painstaking cut-and-paste work, or you can do the smart thing and load your data into R and be done with your work in just a few lines of code. Here&#8217;s how.</p>
<p>Your first step is to get the data into R. How you do this depends on the current format of your data. This example uses a comma separated value (CSV) file stored using Google Spreadsheets. You can use the same URL, or you can use a locally stored file. (I&#8217;ve shortened the URL with a cool WordPress plugin called <a href="http://blairwilliams.com/pretty-link/">Pretty Link</a>.) I&#8217;ll call the data frame &#8220;aa&#8221;. Since I know that the CSV file has a header row, I&#8217;ll add that argument in for R so that it will add column names automatically.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

<span style="color:#ff0080; font-weight:bold">&gt;</span> aa <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">read.csv</span><span style="color:#ff0080; font-weight:bold">(</span><span style="color:#a68500">&quot;http://news.mrdwab.com/reshape&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> header<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#8080c0; font-weight:bold">T</span><span style="color:#ff0080; font-weight:bold">)</span>
</pre>
<p>At this point, if you want to see what you&#8217;ve imported, you can use the &quot;fix&quot; function which will open up R&#8217;s data browser. I&#8217;m only interested in seeing the column names and seeing how many columns there are so instead, I&#8217;ve just used &quot;names(aa)&quot; which, as you can see below, lets me know that there are 21 columns in this data frame.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">names</span><span style="color:#ff0080; font-weight:bold">(</span>aa<span style="color:#ff0080; font-weight:bold">)</span>
 <span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">1</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;Unique.var&quot;</span> <span style="color:#a68500">&quot;UnitID.01&quot;</span>  <span style="color:#a68500">&quot;UnitID.02&quot;</span>  <span style="color:#a68500">&quot;UnitID.03&quot;</span>  <span style="color:#a68500">&quot;UnitID.04&quot;</span>
 <span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">6</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;UnitID.05&quot;</span>  <span style="color:#a68500">&quot;ItemA.01&quot;</span>   <span style="color:#a68500">&quot;ItemA.02&quot;</span>   <span style="color:#a68500">&quot;ItemA.03&quot;</span>   <span style="color:#a68500">&quot;ItemA.04&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">11</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;ItemA.05&quot;</span>   <span style="color:#a68500">&quot;ItemB.01&quot;</span>   <span style="color:#a68500">&quot;ItemB.02&quot;</span>   <span style="color:#a68500">&quot;ItemB.03&quot;</span>   <span style="color:#a68500">&quot;ItemB.04&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">16</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;ItemB.05&quot;</span>   <span style="color:#a68500">&quot;ItemC.01&quot;</span>   <span style="color:#a68500">&quot;ItemC.02&quot;</span>   <span style="color:#a68500">&quot;ItemC.03&quot;</span>   <span style="color:#a68500">&quot;ItemC.04&quot;</span>
<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#800080; font-weight:bold">21</span><span style="color:#ff0080; font-weight:bold">]</span> <span style="color:#a68500">&quot;ItemC.05&quot;</span>
</pre>
<p>The following is where the reshaping happens. We need to tell R what data is being reshaped (in this example, &quot;aa&quot;), which direction we want (&quot;wide&quot; or &quot;long&quot;), and which columns &quot;vary&quot; (in this example, columns 2 through 21). This is where the &quot;names&quot; command comes in useful: as we can easily see in the step above, there are 21 columns in this data frame, and every column except the first one varies.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

<span style="color:#ff0080; font-weight:bold">&gt;</span> bb <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">reshape</span><span style="color:#ff0080; font-weight:bold">(</span>aa<span style="color:#ff0080; font-weight:bold">,</span> direction<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#a68500">&quot;long&quot;</span><span style="color:#ff0080; font-weight:bold">,</span> varying<span style="color:#ff0080; font-weight:bold">=</span><span style="color:#800080; font-weight:bold">2</span><span style="color:#ff0080; font-weight:bold">:</span><span style="color:#800080; font-weight:bold">21</span><span style="color:#ff0080; font-weight:bold">)</span>
</pre>
<blockquote><p><em><strong>Note:</strong></em> The code presented above is the simplest form of using the &#8220;reshape&#8221; command in R. Depending on how you have named the variables in your actual dataset, you might need to add a few arguments. For example, R will automatically use the period as a separator, but if your variables are named &#8220;variableA_01&#8221;, &#8220;variableA_02&#8221; and so on, you will need to specify this information to R by adding <code>sep="_"</code> to the reshape command.</p></blockquote>
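<p>As a rough sketch of that note, here is what the underscore case might look like. The data frame &quot;wide&quot; and all of its values are made up for illustration; only the <code>sep</code> argument is the point.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

# Hypothetical wide data frame whose column names use "_" instead of "."
wide = data.frame(Unique.var = 1:2,
                  ItemA_01 = c(10, 20), ItemA_02 = c(11, 21),
                  ItemB_01 = c(30, 40), ItemB_02 = c(31, 41))

# sep = "_" tells reshape() to split each name at the underscore,
# so "ItemA_01" becomes variable "ItemA" at time 1
long = reshape(wide, direction = "long", varying = 2:5, sep = "_")

names(long) # includes "ItemA" and "ItemB", plus the "time" and "id" helper columns
</pre>
<p>Without <code>sep="_"</code>, R would look for a period in each name, fail to find one, and stop with an error about not being able to guess the time-varying variables.</p>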
<p>At this point, you can use a simple &#8220;fix&#8221; command, which opens the data frame in R&#8217;s data editor, to see what your data frame looks like.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

<span style="color:#ff0080; font-weight:bold">&gt;</span> <span style="color:#0080c0">fix</span><span style="color:#ff0080; font-weight:bold">(</span>bb<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># &quot;bb&quot; is the name of the data frame we want to see, right?</span>
</pre>
<p>It should look like this:</p>
<div align="center"><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=0An2f7Ho_4e0fdGphRzQzTnNlZHMxUHNkSExtQ3lwM0E&#038;hl=en&#038;single=true&#038;gid=2&#038;output=html&#038;widget=true'></iframe></div>
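<p>One thing to keep in mind is that &quot;fix&quot; opens an editor, so any change you make there is saved back to the data frame. If you only want to look, a few console commands will do the job without that risk. The small &quot;bb&quot; below is a made-up stand-in so that the snippet runs on its own; with your real data you would skip that first step.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

# Toy stand-in for the reshaped data frame "bb" (values are invented)
bb = data.frame(Unique.var = c(1, 2, 1, 2),
                time = c(1, 1, 2, 2),
                UnitID = c(101, 102, 103, NA))

head(bb) # print the first few rows at the console
str(bb)  # one line per column: name, type, and the first few values
dim(bb)  # number of rows, then number of columns
</pre>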
<p>This is OK, but not great. First, you can see that R has sorted the data frame in a somewhat strange way. For example, I cannot see all of the information about Unique.var &#8220;1&#8221; together, since its first set of records is on the first row of the data (omitting the header) and its second set of records is on the twentieth row. Second, there are a lot of rows with NA values which, if removed, would make the dataset easier to view. The following two lines take care of both problems.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

<span style="color:#ff0080; font-weight:bold">&gt;</span> cc <span style="color:#ff0080; font-weight:bold">=</span> bb<span style="color:#ff0080; font-weight:bold">[</span><span style="color:#0080c0">with</span><span style="color:#ff0080; font-weight:bold">(</span>bb<span style="color:#ff0080; font-weight:bold">,</span> <span style="color:#0080c0">order</span><span style="color:#ff0080; font-weight:bold">(</span>Unique.var<span style="color:#ff0080; font-weight:bold">)),]</span> <span style="color:#f27900"># Order the data frame by &quot;Unique.var&quot;</span>
<span style="color:#ff0080; font-weight:bold">&gt;</span> dd <span style="color:#ff0080; font-weight:bold">=</span> <span style="color:#0080c0">na.omit</span><span style="color:#ff0080; font-weight:bold">(</span>cc<span style="color:#ff0080; font-weight:bold">)</span> <span style="color:#f27900"># Drop every row that contains an NA value.</span>
</pre>
<p>After running those two lines, you&#8217;ll have <em>almost</em> what we had in the second spreadsheet in this post. From there, it&#8217;s just a matter of saving your final data frame as a CSV file, opening that in a spreadsheet program (I just use Google Docs), and deleting the columns you don&#8217;t need (in this example, the first [unnamed] column, &#8220;time&#8221;, and &#8220;id&#8221;). Of course, when you are doing this, you don&#8217;t <em>need</em> to assign a new name to your data frame at each step. I only do it so that I can easily compare what is going on from change to change.</p>
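<p>If you would rather skip the spreadsheet cleanup, the extra columns can also be dropped in R before saving. What follows is only a sketch of the whole sequence on a tiny invented data frame: the values, the name &quot;ee&quot;, and the temporary output file are all assumptions made so that the example runs by itself.</p>
<pre style="color:#000000; background-color:#eeeeee; font-size:8pt; font-family:'Courier New';">

# Made-up wide data: two records per row, the second one sometimes missing
aa = data.frame(Unique.var = 1:3,
                UnitID.01 = c(11, 21, 31), UnitID.02 = c(12, NA, 32),
                ItemA.01 = c(5, 6, 7),     ItemA.02 = c(8, NA, 9))

bb = reshape(aa, direction = "long", varying = 2:5)
cc = bb[order(bb$Unique.var), ] # keep each Unique.var together
dd = na.omit(cc)                # drop every row that contains an NA

# Drop the helper columns that reshape() added
ee = dd[, !(names(dd) %in% c("time", "id"))]

# row.names = FALSE keeps the unnamed row-name column out of the CSV.
# tempfile() is used only so this sketch does not write to your working directory.
out = tempfile(fileext = ".csv")
write.csv(ee, out, row.names = FALSE)
</pre>
<p>With real data you would replace the temporary file with a file name of your own choosing.</p>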


]]></content:encoded>
			<wfw:commentRss>http://news.mrdwab.com/2010-04-18/reshaping-wide-to-long-in-r/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
