The "koboloadeR" package for R

The “koboloadeR” R package is designed to make it easy to retrieve data collected using KoBo Toolbox or other services that use the same API (for example, KoBo Humanitarian Response or Ona). Of these, KoBo Toolbox is quite generous, with a limit of 1,500 submissions per user per month (and, I believe, something like 5GB per month for data submissions, like photos, videos, or audio recordings, that can be linked to a survey).

The package is available at GitHub. Get it using:


(Note: install_github via @jtilly)

I won’t go into details here about KoBo. It’s an awesome tool for quick mobile data collection. Developing survey tools is very easy, and you get a good range of question types. You can collect data while offline and sync it with the server when you have a data connection available. You can also export the data in different formats for later analysis.

Read on for more details!

splitstackshape V1.4.0 for R

More than a year after splitstackshape V1.2.0, I’ve finally gotten around to making some major updates and submitting the package to CRAN.

So, if you have messed-up datasets filled with concatenated cells of data, and you need to split that data up and reorganize it for later analysis, install and load the latest version (V1.4.0) of splitstackshape with:

## [1] '1.4.0'

Read on for details!

An R function like "order" from Stata

The "splitstackshape" package for R

A while ago, a friend of ours presented me with a data problem. Her questionnaire had some questions where the respondent could provide multiple responses. You know, the "Check as many as apply" type of questions. One common way to store this data is to put a comma-separated list of values into a single cell in a spreadsheet. In fact, if you use something like Google Forms to collect your data and have questions that use check-boxes, that's how your data will end up being stored in a Google Spreadsheet.
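To make the problem concrete, here is a minimal base R sketch of splitting such cells apart. It is not the package's own approach; the `responses` data frame and its column names are made up for illustration.

```r
# Illustrative data: each cell holds a comma-separated set of checked options.
responses <- data.frame(
  id = 1:3,
  fruits = c("apple,banana", "banana", "apple,cherry,banana"),
  stringsAsFactors = FALSE
)

# Split each cell into a character vector of individual answers.
answers <- strsplit(responses$fruits, ",", fixed = TRUE)

# Stack into long form: one row per respondent per selected option.
long <- data.frame(
  id = rep(responses$id, lengths(answers)),
  fruit = trimws(unlist(answers)),
  stringsAsFactors = FALSE
)

# Cross-tabulate into a 0/1 indicator table, one column per option.
indicators <- table(long$id, long$fruit)
```

The long form is usually the easier one to summarize, while the indicator table is closer to what most statistical software expects for "check all that apply" questions.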

What exactly is "elegant" code in R?

In celebration of my achieving 10,000 “reputation” on Stack Overflow, I’m re-posting one of my questions from there that was (as I had expected) deleted after being live for about 5 hours. In that time, I never really got a satisfactory answer, so if anyone wants to offer one in the comments, that would be great!

Stratified random sampling in R from a data frame

Important update: The original function that was present at this post has been deleted. Instead, I’ve posted a much improved version for the sake of others visiting this page. The function is presently defined as:

Arguments

  df: The input data.frame.
  group: The grouping column(s). Can be a character vector or the numeric positions of the columns.
  size: The desired sample size. Can be a decimal (proportionate by group) or an integer (same number of samples per group).
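As a rough idea of how those three arguments could fit together, here is a minimal base R sketch. It is not the function from the post: the name `stratified_sketch`, the rounding rule for proportions, and the example data are all assumptions.

```r
# Hypothetical helper, NOT the function from the original post.
stratified_sketch <- function(df, group, size) {
  # Split row indices by the (interaction of the) grouping column(s).
  rows_by_group <- split(seq_len(nrow(df)), df[group], drop = TRUE)
  picked <- lapply(rows_by_group, function(rows) {
    # size < 1: a proportion of each group; otherwise a fixed count per group.
    n <- if (size < 1) round(length(rows) * size) else size
    if (n > length(rows)) stop("a group has fewer rows than requested")
    # Sample positions rather than values so one-row groups behave correctly.
    rows[sample.int(length(rows), n)]
  })
  df[sort(unlist(picked, use.names = FALSE)), , drop = FALSE]
}

# Made-up example data: 12 rows in group "a", 8 rows in group "b".
set.seed(1)
dat <- data.frame(id = 1:20, grp = rep(c("a", "b"), times = c(12, 8)))
by_count <- stratified_sketch(dat, "grp", 3)     # same count per group
by_share <- stratified_sketch(dat, "grp", 0.25)  # same proportion per group
```

Because `group` is only used to index `df`, it can be given as column names or as numeric positions, matching the argument description above.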

Reshaping data in R revisited

A year ago, I wrote a post about reshaping data from a wide format to a long format. Considering how much time had passed, I thought it would be good to revisit R’s built-in reshape functions. For these examples, I’ve copied the Stata examples from the UCLA Academic Technology Services “Reshape data wide to long” page. Since the data is provided in Stata dta files, you first need to load the “foreign” package to be able to read the data in R.
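For a flavor of what the built-in approach looks like, here is a minimal sketch using base R's `reshape()`. The data frame is made up for illustration rather than read from the UCLA dta files, and its column names are assumptions.

```r
# Illustrative wide data: one income column per year, one row per family.
wide <- data.frame(
  famid = 1:3,
  faminc96 = c(40000, 45000, 75000),
  faminc97 = c(40500, 45400, 76000),
  faminc98 = c(41000, 45800, 77000)
)

# Stack the three income columns into one, tracking the year in a new column.
long <- reshape(
  wide,
  direction = "long",
  idvar = "famid",
  varying = c("faminc96", "faminc97", "faminc98"),
  v.names = "faminc",
  timevar = "year",
  times = 96:98
)

# Sort so that each family's years appear together.
long <- long[order(long$famid, long$year), ]
```

Spelling out `varying`, `v.names`, and `times` is more verbose than letting `reshape()` guess from the column names, but it makes the mapping from wide columns to long rows explicit.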

The new sample size calculator for R (already)

aka “Maybe I shouldn’t post so quickly”

Just hours ago, I posted my first set of functions for R to determine the sample size for a known population. Then, I had to update that post to reflect my newfound knowledge, and now, I thought I would update again, so that the best functions I came up with would all be in one place. There are two functions, sample.size.table() and sample.size(). Here’s some more information about each.
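For readers who just want the general idea, here is a minimal sketch of one common way to compute a sample size for a known population (a normal-approximation formula with a finite population correction). It is not the code behind sample.size() or sample.size.table(); the function name, arguments, and defaults below are assumptions.

```r
# Hypothetical helper; name, arguments, and defaults are assumptions.
sample_size_sketch <- function(population, conf_level = 0.95,
                               margin = 0.05, p = 0.5) {
  # z-score for a two-sided interval at the requested confidence level.
  z <- qnorm(1 - (1 - conf_level) / 2)
  # Sample size for an effectively infinite population.
  n0 <- z^2 * p * (1 - p) / margin^2
  # Finite population correction shrinks n0 for small populations.
  n <- n0 / (1 + (n0 - 1) / population)
  ceiling(n)
}

# A small table of sample sizes for several population sizes.
sizes <- sapply(c(500, 5000, 50000), sample_size_sketch)
```

Using `p = 0.5` is the conservative choice, since it maximizes `p * (1 - p)` and therefore the required sample size.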

Simple sampling with R

I mentioned in an earlier post (“Am I inconsistent?”) that I got interested in R because Amy had asked me to help her with some sampling at one point. Since that was my starting point, I thought I would share some of my experiments with you. In this post:

  1. Simple random sampling
  2. Simple random sampling with a seed
  3. Sorting your sample
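
The three items above can be sketched in a few lines of base R. The population of 100 numbered units and the seed value are arbitrary choices for illustration.

```r
# An illustrative sampling frame: units numbered 1 to 100.
population <- 1:100

# 1. Simple random sampling: 10 units, without replacement.
draw_a <- sample(population, 10)

# 2. Sampling with a seed: the same seed reproduces the same sample.
set.seed(42)
seeded_1 <- sample(population, 10)
set.seed(42)
seeded_2 <- sample(population, 10)

# 3. Sorting the sample so it is easier to look units up.
sorted_sample <- sort(seeded_1)
```

Setting the seed immediately before each call to `sample()` is what makes the two seeded draws identical; a draw made without resetting it will generally differ.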