… here is a goddess that I am happy to worship…
Anyone who has spent some time in India is sure to have noticed the slogans painted on the back of trucks, autos, and other vehicles advising “we two, ours one”. This is part of India’s “family planning” efforts–efforts which have had a pretty bumpy history that included a forced sterilization program.
Originally, the slogans were “we two, ours two”, or at least that was the catchy English version–regional languages usually had a slogan more along the lines of “one family, two children”. And, the change to the new slogan led to at least one humorous math discussion with an auto driver who commented that, “Earlier, it was ‘we two, ours two’; now, it is ‘we two, ours one’. What’s next? ‘We two, ours half?’”
Anyway, keen observers might have noticed the following new addition to selected trucks:
IMPORTANT: This is here mostly to remind me of how I solved my problem. You should read Stratified random sampling in R from a data frame if you really want to use this function.
I know that sampling is quite complex, and I will admit that I know very little about its complexities. Fortunately, software like R lets you draw simple random samples pretty easily, either with or without replacement. Unfortunately, I could not find any feature to allow me to do simple stratified random sampling, at least not with the features I was looking for. Fortunately again, with a little bit of experimenting, it can be pretty easy to learn how to write functions in R when a direct solution does not present itself.
This post shares my initial “work-in-progress” on writing an R function for stratified sampling.
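To give a rough idea of what this looks like, here is a minimal sketch (not the function from the post itself). It first draws simple random samples with base R's `sample()`, with and without replacement, and then draws a fixed number of rows from each group of a data frame. The helper name `stratified_sketch()`, its arguments, and the toy `village` data are all made up for illustration.

```r
# Hypothetical helper (an assumption, not the post's function):
# draw `n` rows from every stratum defined by the column `group`.
stratified_sketch <- function(df, group, n, replace = FALSE) {
  # Row numbers of the data frame, split by stratum
  idx_by_group <- split(seq_len(nrow(df)), df[[group]])
  picked <- lapply(idx_by_group, function(idx) {
    if (!replace && n > length(idx)) {
      stop("a stratum has fewer rows than the requested sample size")
    }
    # Sample positions within idx, so a one-row stratum is not
    # misread by sample() as the range 1:idx
    idx[sample.int(length(idx), size = n, replace = replace)]
  })
  # Return the chosen rows in their original order
  df[sort(unlist(picked, use.names = FALSE)), , drop = FALSE]
}

set.seed(1)  # only so the toy example is reproducible

# Made-up data: 12 households in three villages of unequal size
toy <- data.frame(id = 1:12,
                  village = rep(c("A", "B", "C"), times = c(6, 4, 2)))

srs_without <- sample(toy$id, 5)                 # simple random sample, no replacement
srs_with    <- sample(toy$id, 5, replace = TRUE) # simple random sample, with replacement

out <- stratified_sketch(toy, "village", n = 2)
table(out$village)  # two rows from each village
```

Drawing the same number of rows from each stratum is only one possible design; sampling a proportion of each stratum would need a different `size` calculation.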
I’ve been meaning to learn how to use regular expressions for quite some time now, but just never seemed to get around to doing so. The other night, I decided to take a stab at them though, and over the past few days, I’ve sort of managed to learn a few tricks. Some of these might seem unnecessary, particularly since the examples comprise relatively small chunks of text. But, hopefully you can also see the application of the same techniques for larger text files. In some of the examples, I’ve also included how it might help with preparing your data for use with a program like R. For all of these examples, I’ve used Geany as my text editor. I suggest you use a good text editor like Geany or Notepad++ too.
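The same kind of search-and-replace can also be done inside R once you are comfortable with the pattern syntax. As a small sketch on made-up data (the names and values below are invented for illustration), this collapses runs of spaces or tabs into commas so that a loosely aligned block of text can be read as a data frame:

```r
# Made-up lines, as they might look when copied from a document
raw <- c("Name   Age  City",
         "Asha   31   Madurai",
         "Ravi   27   Chennai")

# "[ \t]+" matches one or more spaces or tabs; replace each run with a comma
csv_lines <- gsub("[ \t]+", ",", trimws(raw))

# Read the cleaned-up text as if it were a CSV file
clean <- read.csv(text = paste(csv_lines, collapse = "\n"),
                  stringsAsFactors = FALSE)
str(clean)
```

This only works when the values themselves contain no spaces; a field like "New Delhi" would need a more careful pattern.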
When people begin the study of communication, their attitudes vary anywhere from “I think this would be a very important class: it is important to understand the communication process if I want to improve the effectiveness of my communication,” to “What a waste of time. I’ve been communicating all my life. Do I really need to take a course to understand communication?”
Whether or not we take a course in communication, there is considerable value in trying to refine our understanding of communication. To demonstrate, I will present two class exercises. In describing the exercises, hopefully some of the jargon common in the communications discipline (for example, encoding, decoding, channel, and congruence) will become clearer, and you will be at least a little more sensitive to trying to verify the effectiveness of your everyday communication approaches.
Yesterday, at DHAN Foundation’s “Foundation Day” celebration, the students from PDM 11 of the Tata-Dhan Academy and others were able to have–after a really long break–a new issue of Spectrum: Colours of Development in their hands to look at (and hopefully read). Spectrum is the student newsletter of the Academy; students contribute articles, solicit articles from faculty, and do the shortlisting and preliminary editing before passing it on to me for further polishing.
One of the things I’m happiest about is that we were able to do this issue entirely using open-source tools. In the past, PageMaker and CorelDraw have been used. These programs are good, but in the long run, we would like the students to create (including design) these newsletters entirely on their own. We don’t have copies of these programs for the students to use, and I don’t encourage the students to use pirated software (see Am I inconsistent?), so I thought it would be good to experiment with this issue and do the design entirely using open-source programs.
So, for that, we used Inkscape and GIMP for all of the graphics, and Scribus for the layout. Since there were a couple of tables in this issue and Scribus has terrible support for tables at the moment, we used OpenOffice.org Writer to design the table, copied the table into OpenOffice.org Draw, exported the table as an EPS file, and imported that as a vector graphic into Scribus. Kind of a roundabout way, I know, but then again, as far as I remember neither PageMaker nor CorelDraw is that great for tables either. For the fonts, I decided to use the Linux Libertine font family both because it is open type and because it is a really nice font. And, for hosting online, we decided to use WordPress (though we are using the .com variant, rather than the .org variant–for now at least).
Overall, I’m happy with the results! Check it out for yourself: Spectrum – Issue 4.
aka “Maybe I shouldn’t post so quickly”
Just hours ago, I posted my first set of functions for R to determine the sample size for a known population. Then, I had to update that post to reflect my newfound knowledge, and now, I thought I would update again, so that the best functions I came up with would all be in one place. There are two functions,
sample.size() which you can easily load in R by typing
source("http://news.mrdwab.com/samplesize") at the command prompt. Here’s some more information about each.
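For readers who just want to see the idea without loading anything, here is a minimal sketch of one common way to compute a sample size for a known population: Cochran's formula for a proportion, followed by a finite population correction. This is an assumption about the approach, not a copy of `sample.size()`; the name `sample_size_sketch()` and its arguments and defaults are hypothetical.

```r
# Hypothetical helper (not the sample.size() function from this post).
# N    : known population size
# conf : confidence level, e.g. 0.95
# e    : margin of error as a proportion, e.g. 0.05
# p    : expected proportion; 0.5 gives the most conservative (largest) n
sample_size_sketch <- function(N, conf = 0.95, e = 0.05, p = 0.5) {
  z  <- qnorm(1 - (1 - conf) / 2)   # two-sided critical value
  n0 <- z^2 * p * (1 - p) / e^2     # Cochran's formula, infinite population
  n  <- n0 / (1 + (n0 - 1) / N)     # finite population correction
  ceiling(n)                        # round up to whole respondents
}

n_small <- sample_size_sketch(1000)
n_large <- sample_size_sketch(100000)
c(n_small, n_large)
```

The correction matters most for small populations; as `N` grows, the result approaches the uncorrected value from the first formula.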