Mr Razor’s Favorite Thing

Every so often, and for no apparent reason,
the sly razor demon emerges with a smile and says,
“Off with his beard! Off with his beard!”
and before I can ask “Why?”
my facial hair has disappeared.

Today, that happened to me again,
and as usual, it was out of my control.
I almost didn’t even realize what happened,
until I stepped outside and my chin was cold.

For your listening enjoyment:

Mr Razor’s Favorite Thing: MP3 | OGG

(Apologies to Rodgers and Hammerstein, Julie Andrews, and John Coltrane.)

Picnic With a Gun – Kowalski 14 inch EP


From the liner notes:

PICNIC WITH A GUN: An Introduction.

As every band is, we are often asked to describe our music. We’ve done
so in a variety of different ways. We’ve listed our influences,
compared ourselves to other bands, and even invited terms like
industriopunkrockskafunkpopdanceabilly. All of these things are
stupid. We’re just PWAG. This is not to say that we’ve invented some
new, cutting edge sound that will soon become the hottest genre.
Instead, we think we’ve infused a variety of styles together and came
up with something that could, at best, be described as “neat”.

splitstackshape V1.4.0 for R

More than a year after splitstackshape V1.2.0, I’ve finally gotten around to making some major updates and submitting the package to CRAN.

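So, if you have messy datasets filled with concatenated cells of data, and you need to split that data up and reorganize it for later analysis, install the latest version (V1.4.0) of splitstackshape from CRAN with install.packages("splitstackshape"), load it with library(splitstackshape), and check the installed version with packageVersion("splitstackshape"):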

## [1] '1.4.0'

Read on for details!


The “splitstackshape” package for R

A while ago, a friend of ours presented me with a data problem. Her questionnaire had some questions where the respondent could provide multiple responses. You know, the “Check as many as apply” type of questions. One common way to store this kind of data is to put all of a respondent’s choices into a single spreadsheet cell as comma-separated values. In fact, if you use something like Google Forms to collect your data and have questions that use check-boxes, that’s how your data will end up being stored in a Google Spreadsheet.
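To make the problem concrete, here is a minimal base R sketch of splitting such cells. It is not the splitstackshape implementation; the data frame, its column names, and the fruit values are all made up for illustration, and it assumes the choices are separated by a comma with optional whitespace.

```r
# Hypothetical survey data: one row per respondent, with every checked
# option stored together in a single comma-separated string.
responses <- data.frame(
  id = 1:3,
  fruits = c("apple, banana", "banana", "apple, cherry, banana"),
  stringsAsFactors = FALSE
)

# Split each cell into its individual choices (comma plus optional spaces).
choices <- strsplit(responses$fruits, ",\\s*")

# Long format: one row per respondent-choice pair.
long <- data.frame(
  id = rep(responses$id, lengths(choices)),
  fruit = unlist(choices),
  stringsAsFactors = FALSE
)

# Wide format: one count column per distinct choice
# (0 or 1 here, because no respondent repeats a choice).
wide <- as.data.frame.matrix(table(long$id, long$fruit))
```

The long form is usually the easier one to tabulate, while the wide indicator form is closer to what most statistical routines expect.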


What exactly is “elegant” code in R?

In celebration of my achieving 10,000 “reputation” on Stack Overflow, I’m re-posting one of my questions from there that was (as I had expected) deleted after being live for about 5 hours. In that time, I never really got a satisfactory answer, so if anyone wants to offer one in the comments, that would be great!


We two, ours one

On the trucks around town…

Anyone who has spent some time in India is sure to have noticed the slogans painted on the back of trucks, autos, and other vehicles advising “we two, ours one”. This is part of India’s “family planning” efforts–efforts which have had a pretty bumpy history that included a forced sterilization program.

Originally, the slogans were “we two, ours two”, or at least that was the catchy English version; regional languages usually had a slogan more along the lines of “one family, two children”. The change to the new slogan led to at least one humorous math discussion with an auto driver, who commented, “Earlier, it was ‘we two, ours two’; now, it is ‘we two, ours one’. What’s next? ‘We two, ours half’?”

Anyway, keen observers might have noticed the following new addition to selected trucks:

We two, ours one


Stratified random sampling in R from a data frame

Important update

The original function that appeared in this post has been deleted. Instead, I’ve posted a much improved version for the sake of others visiting this page. The function presently takes the following arguments:


  • df: The input data.frame
  • group: The grouping column(s). Can be a character vector or the numeric positions of the columns.
  • size: The desired sample size. Can be a decimal (proportionate by group) or an integer (same number of samples per group).
  • select: A named list with optional subsetting statements.
  • replace: Logical. Should sampling be done with replacement?
  • bothSets: Logical. Should a list with both the sampled and the remaining rows be returned? Useful when setting up “training” and “testing” sets.
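As a rough illustration of what the group and size arguments describe, here is a minimal base R sketch. It is not the function from this post: stratified_sketch is a hypothetical helper that covers only group, size, and replace, and it assumes that a size below 1 means a proportion of each group while a size of 1 or more means a fixed count per group.

```r
# Hypothetical helper, NOT the function described in this post.
# size < 1  : sample that proportion of each group (rounded).
# size >= 1 : sample that many rows from each group.
stratified_sketch <- function(df, group, size, replace = FALSE) {
  # Row numbers of df, split by the grouping column(s).
  idx <- split(seq_len(nrow(df)), df[group], drop = TRUE)
  picked <- lapply(idx, function(i) {
    n <- if (size < 1) round(length(i) * size) else size
    # Index into i instead of calling sample(i, n): sample() treats a
    # single number x as 1:x, which would be wrong for one-row groups.
    i[sample.int(length(i), n, replace = replace)]
  })
  # Return the sampled rows in their original order.
  df[sort(unlist(picked, use.names = FALSE)), , drop = FALSE]
}

# Made-up data: 12 rows in group "a" and 8 rows in group "b".
set.seed(1)
dat <- data.frame(id = 1:20, grp = rep(c("a", "b"), times = c(12, 8)))

fixed <- stratified_sketch(dat, "grp", 3)    # 3 rows from each group
prop  <- stratified_sketch(dat, "grp", 0.5)  # half of each group
```

Without replacement, a fixed size larger than the smallest group makes sample.int() stop with an error, so a fuller version would need to check group sizes first.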


And here are some examples of the function in action:

There is also a data.table version that is much more efficient but has the same functionality.

Reshaping data in R revisited

A year ago, I wrote a post about reshaping data from a wide format to a long format. Considering how much time has passed, I thought it would be good to revisit R’s built-in reshape functions. For these examples, I’ve copied the Stata examples from the UCLA Academic Technology Services “Reshape data wide to long” page. Since the data is provided in Stata dta files, you first need to load the “foreign” package to be able to read the data in R.
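For readers who want a quick picture of what a wide-to-long reshape looks like, here is a minimal sketch using base R’s reshape(). The data frame and its column names (famid, inc96 to inc98) are hypothetical stand-ins typed in by hand, not the UCLA data; a real dta file would normally be read first with read.dta() from the foreign package.

```r
# Hypothetical wide data: one row per family, one income column per year.
# (With a real Stata file, this would come from foreign::read.dta().)
wide <- data.frame(
  famid = 1:3,
  inc96 = c(100, 200, 300),
  inc97 = c(110, 210, 310),
  inc98 = c(120, 220, 320)
)

# Stack the three income columns into a single "inc" column, and record
# which year each value came from in a new "year" column.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("inc96", "inc97", "inc98"),
  v.names   = "inc",
  timevar   = "year",
  times     = 96:98,
  idvar     = "famid"
)

# Sort so that each family's rows sit together, in year order.
long <- long[order(long$famid, long$year), ]
```

Each family now contributes three rows, one per year, instead of one row with three income columns.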
