Recently we had to analyze data on the number of daily visits to SimplyStatistics.org. There were two goals:
- Estimate the fraction of visitors retained after a spike in the number of visitors
- Identify any factors that influence the fraction estimated in the first goal.
For me it was a fun project, in part because I like SimplyStatistics, but also because I thought that answering these questions would be interesting and would help in understanding the readers of that blog.
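To make the first goal a bit more concrete, here is a minimal Python sketch of one possible way to define the "fraction retained" after a spike. The definition is an assumption for illustration and not necessarily what the report uses: compare the average daily visits in a window after the spike against the baseline before it, relative to how far the spike rose above that baseline. The function name `retained_fraction`, the `window` parameter, and the visit counts are all made up.

```python
from statistics import mean


def retained_fraction(visits, spike_day, window=7):
    """Fraction of the spike's extra visitors still around afterwards.

    Assumed definition (hypothetical, not taken from the report):
        (mean after - mean before) / (visits on spike day - mean before)
    """
    before = visits[max(0, spike_day - window):spike_day]
    after = visits[spike_day + 1:spike_day + 1 + window]
    if not before or not after:
        raise ValueError("need data on both sides of the spike")

    baseline = mean(before)
    extra_at_spike = visits[spike_day] - baseline
    if extra_at_spike <= 0:
        raise ValueError("spike_day is not above the baseline")

    return (mean(after) - baseline) / extra_at_spike


# Made-up daily visit counts, purely for illustration.
visits = [100, 100, 100, 500, 120, 120, 120]
print(retained_fraction(visits, spike_day=3, window=3))
```

With a definition like this, each spike gives one number, and the second goal becomes a question of how those numbers vary across spikes.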
Half joking with other students, I said that I basically did t-tests. Hopefully I can work on changing this tendency with the pile of recommended books I’ve been acquiring but not really reading through, except for ggplot2: Elegant Graphics for Data Analysis and the R Graphics Cookbook. Sounds like spring break will be fun :P
Kind of related to this, Jeff Leek announced yesterday that he is going to compile a list of student blogs that have something to do with statistics and data. He added a link to my blog, which is why I saw a large peak in Fellgernon Bit’s visitor data. After all, when doing the data analysis described above I played around with the data from Fellgernon Bit, and I now know that, at a minimum, posting drives visitors to a site (which sounds obvious, but maybe you only get random traffic); see fig 1 of the report.
Had Jeff done so earlier, I could have had a point estimate (though without being able to say anything about its uncertainty) that SimplyStatistics has 142 visitors who read the posts AND click on the links. Maybe using the info from Hilary’s and Alyssa’s blogs we could get an estimate with some measure of uncertainty, but only for March 8th.