<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>Welcome to Fellgernon Bit! It’s my academic corner in this bit-world to pretty much share stuff I find interesting. I’ll post my comments on news, science in general, Biostatistics, R &amp; Bioconductor, genomics, etc ^_^. 
Feel free to visit my academic website.</description><title>Fellgernon Bit</title><generator>Tumblr (3.0; @fellgernon)</generator><link>http://fellgernon.tumblr.com/</link><item><title>Reading an R file from GitHub</title><description>&lt;p&gt;Let&amp;#8217;s say that I want to read &lt;a href="https://github.com/lcolladotor/ballgownR-devel/blob/master/ballgownR/R/infoGene.R"&gt;this R file&lt;/a&gt; from GitHub into R.&lt;/p&gt;
&lt;p&gt;The first thing you have to do is locate the raw file. You can do so by clicking on the &lt;strong&gt;Raw&lt;/strong&gt; button in GitHub. In this case it&amp;#8217;s &lt;a href="https://raw.github.com/lcolladotor/ballgownR-devel/master/ballgownR/R/infoGene.R"&gt;https://raw.github.com/lcolladotor/ballgownR-devel/master/ballgownR/R/infoGene.R&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;One would think that using &lt;code&gt;source()&lt;/code&gt; would work, but it doesn&amp;#8217;t, as shown below:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;source("https://raw.github.com/lcolladotor/ballgownR-devel/master/ballgownR/R/infoGene.R")
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: unsupported URL scheme
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Error: cannot open the connection
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, thanks again to Hadley Wickham, you can do so by using the &lt;code&gt;devtools&lt;/code&gt; (&lt;span class="showtooltip" title="Wickham H and Chang W (2013). devtools: Tools to make developing R code easier. R package version 1.2."&gt;&lt;a href="http://CRAN.R-project.org/package=devtools"&gt;Wickham &amp;amp; Chang, 2013&lt;/a&gt;&lt;/span&gt;) package.&lt;/p&gt;
&lt;p&gt;Here is how it works:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;library(devtools)
library(roxygen2)
## Needed because this file has roxygen2 comments. Otherwise you get a
## 'could not find function 'digest'' error
source_url("https://raw.github.com/lcolladotor/ballgownR-devel/master/ballgownR/R/infoGene.R")
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## SHA-1 hash of file is 6c32a620799eded5d6ff0997a184843d7964724a
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Note that you can specify the SHA-1 hash to be very specific about
## which version of the file you want to read in.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can then check that &lt;code&gt;infoGene&lt;/code&gt; has actually been sourced:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;"infoGene" %in% ls()
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&amp;#8217;s it! Enjoy!&lt;/p&gt;
&lt;p&gt;Citations made with &lt;code&gt;knitcitations&lt;/code&gt; (&lt;span class="showtooltip" title="Boettiger C (2013). knitcitations: Citations for knitr markdown files. R package version 0.4-6."&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;Boettiger, 2013&lt;/a&gt;&lt;/span&gt;).&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Hadley Wickham, Winston Chang, (2013) devtools: Tools to make developing R code easier. &lt;a href="http://CRAN.R-project.org/package=devtools"&gt;http://CRAN.R-project.org/package=devtools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Carl Boettiger, (2013) knitcitations: Citations for knitr markdown files. &lt;a href="https://github.com/cboettig/knitcitations"&gt;https://github.com/cboettig/knitcitations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Reproducibility&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;sessionInfo()
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## R version 3.0.0 (2013-04-03)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] roxygen2_2.2.2      digest_0.6.3        devtools_1.2       
## [4] knitcitations_0.4-6 bibtex_0.3-5        knitr_1.2          
## 
## loaded via a namespace (and not attached):
##  [1] brew_1.0-6     evaluate_0.4.3 formatR_0.7    httr_0.2      
##  [5] memoise_0.1    parallel_3.0.0 RCurl_1.95-4.1 stringr_0.6.2 
##  [9] tools_3.0.0    whisker_0.3-2  XML_3.95-0.2   xtable_1.7-1
&lt;/code&gt;&lt;/pre&gt;</description><link>http://fellgernon.tumblr.com/post/50024045875</link><guid>http://fellgernon.tumblr.com/post/50024045875</guid><pubDate>Thu, 09 May 2013 14:03:30 -0400</pubDate><category>rstats</category><category>R</category><category>github</category></item><item><title>Join and participate in Biostats Social today!</title><description>&lt;p&gt;¡Hello everyone!&lt;/p&gt;
&lt;p&gt;I hope you had a great time this weekend, at the retreat or elsewhere. Now that the department is hyped up with plenty of new ideas and things to do for the self-study and beyond, I think that it&amp;#8217;s a great time to remind everyone about &lt;a href="http://tinyurl.com/biostats-social"&gt;Biostats Social&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;At its core, &lt;a href="http://tinyurl.com/biostats-social"&gt;Biostats Social&lt;/a&gt; is just a private Google Group. But overall it provides us with a space to share things, interact, and socialize with anyone linked to the Department (staff, faculty, postdocs, students of all flavors). It&amp;#8217;s this big melting pot of people that makes &lt;a href="http://tinyurl.com/biostats-social"&gt;Biostats Social&lt;/a&gt; a great place!&lt;/p&gt;
&lt;p&gt;&lt;span&gt;The Google Group infrastructure allows you to choose the mode that you like best: a mailing list, weekly email reports, no emails and thus browser-only access, etc. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The policy is simple: opt-in. To do so, go to &lt;a href="http://tinyurl.com/biostats-social"&gt;tinyurl.com/biostats-social&lt;/a&gt; and request to join (this helps us keep outsiders at bay).&lt;/p&gt;
&lt;p&gt;Best,&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Leonardo&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;PS This is the ad you might have seen on the billboards on the 3rd floor:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://tinyurl.com/biostats-social"&gt;&lt;img src="http://media.tumblr.com/6612b2127a35ca6eb99eb01e397a920d/tumblr_inline_mm0vobOJAG1qz4rgp.jpg"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&amp;#8212;&amp;#8212;&amp;#8212;&amp;#8212;-&lt;/p&gt;
&lt;p&gt;This invitation is exclusive to members of the JHSPH Biostatistics Department.&lt;/p&gt;

</description><link>http://fellgernon.tumblr.com/post/49181340090</link><guid>http://fellgernon.tumblr.com/post/49181340090</guid><pubDate>Mon, 29 Apr 2013 11:17:48 -0400</pubDate><category>Biostats</category><category>Social</category></item><item><title>Using plyr and doMC for quick and easy apply-family functions</title><description>&lt;p&gt;A few weeks back I set aside a little time to actually read what &lt;code&gt;plyr&lt;/code&gt; (&lt;span class="showtooltip" title="Wickham H (2011). The Split-Apply-Combine Strategy for Data
Analysis. _Journal of Statistical Software_, *40*(1), pp. 1-29.
 http://www.jstatsoft.org/v40/i01/."&gt;&lt;a href="http://www.jstatsoft.org/v40/i01/"&gt;Wickham, 2011&lt;/a&gt;&lt;/span&gt;) is about, and I was surprised. The whole idea behind &lt;code&gt;plyr&lt;/code&gt; is very simple: expand the &lt;code&gt;apply()&lt;/code&gt; family to make things easy. &lt;code&gt;plyr&lt;/code&gt; has many functions whose names end with &lt;code&gt;ply&lt;/code&gt;, which is short for apply. The functions are then identified by the two letters before &lt;code&gt;ply&lt;/code&gt;, which abbreviate the input (first letter) and the output (second letter). For instance, &lt;code&gt;ddply&lt;/code&gt; takes a &lt;code&gt;data.frame&lt;/code&gt; as input and returns a &lt;code&gt;data.frame&lt;/code&gt;, while &lt;code&gt;ldply&lt;/code&gt; takes a &lt;code&gt;list&lt;/code&gt; as input and returns a &lt;code&gt;data.frame&lt;/code&gt;.&lt;/p&gt;
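&lt;p&gt;To make the naming convention concrete, here is a minimal sketch with &lt;code&gt;ldply&lt;/code&gt; (list in, &lt;code&gt;data.frame&lt;/code&gt; out). The &lt;code&gt;scores&lt;/code&gt; list and the summary function are made up for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;library(plyr)

## Hypothetical input: a named list of numeric vectors
scores &amp;lt;- list(a = c(1, 2, 3), b = c(4, 5, 6, 7))

## ldply: 'l' = list input, 'd' = data.frame output.
## The function runs on each element and the pieces are row-bound.
res &amp;lt;- ldply(scores, function(x) data.frame(mean = mean(x), n = length(x)))
res
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each list element becomes one row of the result, and because the list is named, the element names end up in an &lt;code&gt;.id&lt;/code&gt; column.&lt;/p&gt;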
&lt;p&gt;The syntax is pretty straightforward. For example, here are the arguments for &lt;code&gt;ddply&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;library(plyr)
args(ddply)
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## function (.data, .variables, .fun = NULL, ..., .progress = "none", 
##     .inform = FALSE, .drop = TRUE, .parallel = FALSE, .paropts = NULL) 
## NULL
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What we basically have to specify are&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;code&gt;.data&lt;/code&gt; which in general is the name of the input &lt;code&gt;data.frame&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.variables&lt;/code&gt; which is a vector (note the use of the &lt;code&gt;.&lt;/code&gt; function) of variable names. In this case, &lt;code&gt;ddply&lt;/code&gt; is very useful for applying some function to subsets of the data as specified by these variables,&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.fun&lt;/code&gt; which is the actual function we want to run,&lt;/li&gt;
&lt;li&gt;and &lt;code&gt;...&lt;/code&gt; which are additional arguments passed on to the function we are running.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;From the &lt;code&gt;ddply&lt;/code&gt; help page we have the following examples:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;dfx &amp;lt;- data.frame(group = c(rep("A", 8), rep("B", 15), rep("C", 6)), sex = sample(c("M", 
    "F"), size = 29, replace = TRUE), age = runif(n = 29, min = 18, max = 54))

# Note the use of the '.' function to allow group and sex to be used
# without quoting
ddply(dfx, .(group, sex), summarize, mean = round(mean(age), 2), sd = round(sd(age), 
    2))
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   group sex  mean    sd
## 1     A   F 40.48 12.72
## 2     A   M 34.48 15.28
## 3     B   F 36.05  9.98
## 4     B   M 38.35  7.97
## 5     C   F 20.04  1.86
## 6     C   M 43.81 10.72
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class="r"&gt;
# An example using a formula for .variables
ddply(baseball[1:100, ], ~year, nrow)
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   year V1
## 1 1871  7
## 2 1872 13
## 3 1873 13
## 4 1874 15
## 5 1875 17
## 6 1876 15
## 7 1877 17
## 8 1878  3
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class="r"&gt;# Applying two functions; nrow and ncol
ddply(baseball, .(lg), c("nrow", "ncol"))
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   lg  nrow ncol
## 1       65   22
## 2 AA   171   22
## 3 AL 10007   22
## 4 FL    37   22
## 5 NL 11378   22
## 6 PL    32   22
## 7 UA     9   22
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But this is not the end of the story! Something I really liked about &lt;code&gt;plyr&lt;/code&gt; is that it can be parallelized via the &lt;code&gt;foreach&lt;/code&gt; (&lt;span class="showtooltip" title="Analytics R (2012). _foreach: Foreach looping construct for R_. R
package version 1.4.0, 
http://CRAN.R-project.org/package=foreach."&gt;&lt;a href="http://CRAN.R-project.org/package=foreach"&gt;Revolution Analytics, 2012&lt;/a&gt;&lt;/span&gt;) package. I don&amp;#8217;t know much about &lt;code&gt;foreach&lt;/code&gt;, but what I learnt is that you need a companion package such as &lt;code&gt;doMC&lt;/code&gt; (&lt;span class="showtooltip" title="Analytics R (2013). _doMC: Foreach parallel adaptor for the
multicore package_. R package version 1.3.0, 
http://CRAN.R-project.org/package=doMC."&gt;&lt;a href="http://CRAN.R-project.org/package=doMC"&gt;Revolution Analytics, 2013&lt;/a&gt;&lt;/span&gt;) to actually run the code. It&amp;#8217;s as if &lt;code&gt;foreach&lt;/code&gt; specifies the infrastructure to communicate in parallel (and split jobs), and packages like &lt;code&gt;doMC&lt;/code&gt; tailor it to specific environments, such as a multi-core machine.&lt;/p&gt;
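&lt;p&gt;As a rough illustration of that split, here is a small sketch that uses &lt;code&gt;foreach&lt;/code&gt; directly with &lt;code&gt;doMC&lt;/code&gt; as the backend. The choice of 2 workers and the toy loop are arbitrary, and &lt;code&gt;doMC&lt;/code&gt; relies on forking, so this assumes a Unix-like system:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;library(foreach)
library(doMC)

## Register the multicore backend with 2 workers (an arbitrary choice)
registerDoMC(2)

## foreach describes the loop; %dopar% hands each iteration to the
## registered backend and .combine = c collects the results in a vector
squares &amp;lt;- foreach(i = 1:4, .combine = c) %dopar% {
    i^2
}
squares
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Swapping &lt;code&gt;%dopar%&lt;/code&gt; for &lt;code&gt;%do%&lt;/code&gt; runs the same loop sequentially, which is handy for debugging.&lt;/p&gt;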
&lt;p&gt;Running things in parallel can then be very easy. Basically, you load the packages, specify the number of cores, and run your &lt;code&gt;ply&lt;/code&gt; function. Here is a short example:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Load packages
library(plyr)
library(doMC)
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: foreach
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: iterators
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: parallel
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class="r"&gt;
## Specify the number of cores
registerDoMC(4)

## Check how many cores we are using
getDoParWorkers()
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 4
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class="r"&gt;
## Run your ply function
ddply(dfx, .(group, sex), summarize, mean = round(mean(age), 2), sd = round(sd(age), 
    2), .parallel = TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##   group sex  mean    sd
## 1     A   F 40.48 12.72
## 2     A   M 34.48 15.28
## 3     B   F 36.05  9.98
## 4     B   M 38.35  7.97
## 5     C   F 20.04  1.86
## 6     C   M 43.81 10.72
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In case you are interested, here is a short shell script for knitting an Rmd file on the cluster and requesting the appropriate number of cores to then use &lt;code&gt;plyr&lt;/code&gt; and &lt;code&gt;doMC&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="bash"&gt;#!/bin/bash 
# To run it in the current working directory
#$ -cwd 
# To get an email after the job is done
#$ -m e 
# To specify that we want 4 cores
#$ -pe local 4
# The name of the job
#$ -N myPlyJob

echo "**** Job starts ****"
date

# Knit your file: assuming it's called FileToKnit.Rmd
Rscript -e "library(knitr); knit2html('FileToKnit.Rmd')"

echo "**** Job ends ****"
date
&lt;/code&gt;&lt;/pre&gt;
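&lt;p&gt;One way to keep the Rmd file in sync with the &lt;code&gt;-pe local 4&lt;/code&gt; request is to read the number of slots from the environment instead of hard-coding it. The sketch below assumes a Grid Engine style scheduler that sets the &lt;code&gt;NSLOTS&lt;/code&gt; variable for parallel-environment jobs; the helper name &lt;code&gt;slots_from_env&lt;/code&gt; is hypothetical and the fallback to 1 core is an assumption:&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Hypothetical helper: number of slots granted by the scheduler,
## falling back to 1 when NSLOTS is not set (e.g. an interactive session)
slots_from_env &amp;lt;- function() {
    as.integer(Sys.getenv("NSLOTS", unset = "1"))
}

library(doMC)

## Register as many workers as the job was given
registerDoMC(slots_from_env())
getDoParWorkers()
&lt;/code&gt;&lt;/pre&gt;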
&lt;p&gt;Let&amp;#8217;s say that the bash script is named &lt;code&gt;script.sh&lt;/code&gt;. Then you can submit it to the cluster queue using&lt;/p&gt;
&lt;pre&gt;&lt;code class="bash"&gt;qsub script.sh
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is what I used to re-format a large &lt;code&gt;data.frame&lt;/code&gt; in a few minutes in the cluster for the &lt;a href="https://twitter.com/search?q=%23jhsph753&amp;amp;src=typd"&gt;#jhsph753&lt;/a&gt; class homework project.&lt;/p&gt;
&lt;p&gt;So, thank you again &lt;a href="https://twitter.com/hadleywickham"&gt;Hadley Wickham&lt;/a&gt; for making awesome R packages!&lt;/p&gt;
&lt;p&gt;Citations made with &lt;code&gt;knitcitations&lt;/code&gt; (&lt;span class="showtooltip" title="Boettiger C (2013). _knitcitations: Citations for knitr markdown
files_. R package version 0.4-4, 
https://github.com/cboettig/knitcitations."&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;Boettiger, 2013&lt;/a&gt;&lt;/span&gt;).&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Revolution Analytics, (2013) doMC: Foreach parallel adaptor for the multicore package. &lt;a href="http://CRAN.R-project.org/package=doMC"&gt;http://CRAN.R-project.org/package=doMC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Revolution Analytics, (2012) foreach: Foreach looping construct for R. &lt;a href="http://CRAN.R-project.org/package=foreach"&gt;http://CRAN.R-project.org/package=foreach&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Carl Boettiger, (2013) knitcitations: Citations for knitr markdown files. &lt;a href="https://github.com/cboettig/knitcitations"&gt;https://github.com/cboettig/knitcitations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Hadley Wickham, (2011) The Split-Apply-Combine Strategy for Data Analysis. &lt;em&gt;Journal of Statistical Software&lt;/em&gt; &lt;strong&gt;40&lt;/strong&gt; (1) &lt;a href="http://www.jstatsoft.org/v40/i01/"&gt;http://www.jstatsoft.org/v40/i01/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://fellgernon.tumblr.com/post/48941418303</link><guid>http://fellgernon.tumblr.com/post/48941418303</guid><pubDate>Fri, 26 Apr 2013 14:14:17 -0400</pubDate><category>rstats</category><category>R</category><category>plyr</category><category>parallel</category><category>knitr</category><category>cluster</category></item><item><title>Epi vs Biostat Kickball match Spring 2013</title><description>&lt;p&gt;This past Saturday the Epi and Biostat troops met for another fun kickball match. Obviously Biostat beat Epi, yup I know: again! This time the score was 15-8 (according to our bookkeeper and captain John) or 12-8 (according to some in Epi).&lt;/p&gt;
&lt;p&gt;There was a hint of a surprise at the beginning when Epi scored two runs in the top of the first inning. However, the tide changed back with a home run by Rumen. Sadly, one of the Epi players got injured on that play and was carried off the field. Rumen also pulled his quad with the big hit and was limited for the rest of the match.&lt;/p&gt;
&lt;p&gt;From that inning on, both teams had fun kicking the ball as far as we could or aiming for the gaps between the defensive lines. There were plenty of sacrifice hits, some occasional errors, but overall we had a lot of fun!&lt;/p&gt;
&lt;p&gt;&lt;img alt="Both teams picture" src="http://biostat.jhsph.edu/%7Elcollado/misc/Kickball2013/images/2013_04_20_16_55_15.jpg"/&gt;&lt;/p&gt;
&lt;p&gt;Both teams came prepared to show their colors: them in red, us in purple with some face paint for the sport battle (thanks to Aaron). However, the Epi crew did surprise us by bringing a big grill to the park and lots of food!&lt;/p&gt;
&lt;p&gt;At the end of the match, we all mingled together and enjoyed the nice (a bit chilly) day outside in the company of some drinks and food.&lt;/p&gt;
&lt;p&gt;Some of us then continued our journey at Kislings where we played other games that involve loads of cups and some ping pong balls ;)&lt;/p&gt;
&lt;p&gt;You can &lt;a href="http://biostat.jhsph.edu/%7Elcollado/misc/Kickball2013/index.html"&gt;view all the pictures here&lt;/a&gt;. If you have any other pictures that you want to share, send them my way!&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/48779184528</link><guid>http://fellgernon.tumblr.com/post/48779184528</guid><pubDate>Wed, 24 Apr 2013 12:00:06 -0400</pubDate><category>Epi</category><category>Biostat</category><category>Kickball</category></item><item><title>Fluid</title><description>&lt;p&gt;While I was looking for a Google Tasks app for the Mac, I found the following &lt;a href="http://www.quora.com/OS-X-Applications/What-is-the-best-way-to-manage-Google-Tasks-on-Mac-OS-X"&gt;Quora thread&lt;/a&gt; where Dave Thompson suggests trying out &lt;a href="http://www.fluidapp.com/"&gt;Fluid&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;iframe frameborder="0" height="300" src="http://player.vimeo.com/video/22820843?title=0&amp;amp;byline=0&amp;amp;portrait=0" width="400"&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s free and super simple to use, and with it I create apps for some of the sites that I visit frequently, like Gmail. &lt;br/&gt;&lt;br/&gt;One advantage I see is that I can now keep Gmail on a specific desktop (right now I&amp;#8217;m using 8 desktops), which helps me organize my work. Otherwise, whenever I open Google Chrome I have to manually move each window to the desktop I want it to be in. &lt;span&gt;The other way of doing this is by using separate browsers for different pages, but then if you want to find something in your browsing history it becomes a mess. Plus, I like the idea of having a tab-less window for Gmail. The cherry on top is that you can choose your favorite icon for each application.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;To illustrate how I separate things, my desktops are currently organized like this:&lt;/p&gt;
&lt;ol&gt;&lt;li&gt;Miscellaneous browsing, reading, etc.&lt;/li&gt;
&lt;li&gt;Homework (either Statistical Theory or Probability Theory)&lt;/li&gt;
&lt;li&gt;Methods (see #jhsph753)&lt;/li&gt;
&lt;li&gt;Research&lt;/li&gt;
&lt;li&gt;Music&lt;/li&gt;
&lt;li&gt;Comprehensive exam studying&lt;/li&gt;
&lt;li&gt;Gmail, Google Reader, Calendar and Twitter&lt;/li&gt;
&lt;li&gt;Tumblr, facebook, &amp;#8230;&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;Without Fluid, that involves moving many browser windows around.&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/48068777698</link><guid>http://fellgernon.tumblr.com/post/48068777698</guid><pubDate>Mon, 15 Apr 2013 17:54:01 -0400</pubDate><category>Fluid</category><category>Mac</category></item><item><title>Laptop fixed =)</title><description>&lt;p&gt;I just want to thank everyone that gave me ideas of what to try and whom to ask for solving the issue I was having with my laptop.&lt;/p&gt;
&lt;p&gt;For future reference and to complement my &lt;a href="http://fellgernon.tumblr.com/post/47680956215/need-some-help-fixing-my-mac#.UWs_gSvF0b0"&gt;previous post&lt;/a&gt;, here&amp;#8217;s a list of other things I tried.&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;span&gt;I ran a memory test using &lt;/span&gt;&lt;span&gt;&lt;/span&gt;&lt;a href="http://osxdaily.com/2011/05/03/memtest-mac-ram-test/"&gt;memtest from here&lt;/a&gt;&lt;span&gt;. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;I updated my OS-X to 10.8.3 from 10.7.5, and although it froze the first time it was installing, it did pick up from where it had left off after rebooting and finished successfully the second time. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;The Mac IT support at the university ran a hidden test (I think that it&amp;#8217;s from booting up by pressing alt and then another shortcut which I missed) that verified that the memory and disk are fine. He did notice that I&amp;#8217;m eligible for a battery replacement, which I&amp;#8217;ll apply for soon and use the AppleCare Protection Plan. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;Changed the HD back to the original one to boot up (up to the recovery screen which you can get to using cmd + R when booting up), and then changed back to my SSD HD. The hope was to re-establish the OS-drive connection (or whatever it&amp;#8217;s called).&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;I ran a file system check (fsck -fy) in single user mode with the Mac IT person. I had done one before, but this one was after upgrading the OS. Plus, he told me it&amp;#8217;s slightly different from the check using the Disk Utility. More &lt;a href="http://reviews.cnet.com/8301-13727_7-20028609-263.html"&gt;here&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;What worked?&lt;/p&gt;
&lt;p&gt;I finally updated the firmware version of my Crucial m4&amp;#160;2.5 inch SSD drive (&lt;a href="http://www.crucial.com/support/ssd/index.aspx?source=web&amp;amp;cpe=m4firmware_us"&gt;available here&lt;/a&gt;) to version 070H from version 009 using a &lt;a href="http://forum.crucial.com/t5/Solid-State-Drives-SSD/How-to-update-M4-SSD-firmware-for-Mac-Os-X-users/td-p/59000"&gt;bootable CD&lt;/a&gt;. I tried multiple ways to do so using a bootable USB but failed and eventually bought a CD to try this option. I do remember upgrading from 002 to 009 using a bootable USB, but I think that I made it with a PC and not a Mac. Crucial has to seriously improve the instructions on how to upgrade the firmware using a bootable USB for Mac users!!&lt;/p&gt;
&lt;p&gt;Anyhow, I wasn&amp;#8217;t expecting this to work but it did. According to the description, the 070H version was made to solve bugs for Windows 8 users. What I think made it work is that it re-made the OS-drive connection (or whatever it&amp;#8217;s called) as the Mac IT guy said. The next options were to reformat the disk completely and complain about the drive to the manufacturer.&lt;/p&gt;
&lt;p&gt;So, now I&amp;#8217;m keeping that 070H firmware disk in case I have to do something like this again.&lt;/p&gt;
&lt;p&gt;And&amp;#8230; now I have Mountain Lion :P and a RAM upgrade in the mail.&lt;/p&gt;
&lt;p&gt;Thanks again everyone for the help!&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/47998882302</link><guid>http://fellgernon.tumblr.com/post/47998882302</guid><pubDate>Sun, 14 Apr 2013 19:59:00 -0400</pubDate><category>Mac</category><category>Problem</category><category>Freeze</category><category>RAM</category><category>help</category></item><item><title>Need some help fixing my Mac...</title><description>&lt;p&gt;2 days ago I was writing some R code for &lt;a href="https://twitter.com/search?q=%23jhsph753&amp;amp;src=typd"&gt;#jhsph753&lt;/a&gt; in an Rmd file. I was careless and didn&amp;#8217;t realize that one computation would be very RAM intensive until I was running &lt;a href="https://twitter.com/search?q=%23knitr&amp;amp;src=typd"&gt;#knitr&lt;/a&gt; (as a &amp;#8216;silent&amp;#8217; process from &lt;a href="https://github.com/textmate/textmate"&gt;TextMate2&lt;/a&gt;). My computer started swapping and became unresponsive, without being able to force quit. So I did a hard restart: aka, I shut it down by holding the power button.&lt;/p&gt;
&lt;p&gt;The problem is that since that happened, my MacBook Pro Early 2011 &lt;span&gt;version&lt;/span&gt;&lt;span&gt; with a 2.7GHz i7 processor freezes every 40-80 min. It&amp;#8217;s the &lt;a href="http://mmlweb.rutgers.edu/music127/basic/crash_freeze.htm"&gt;bad kind of freeze&lt;/a&gt; because the mouse becomes unresponsive and all the usual force quit hotkeys don&amp;#8217;t work. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;From what I understand, that first time that it froze 2 days ago, my laptop started swapping but after the hard restart some of the swap info might have become corrupted or something related to this broke.&lt;/p&gt;
&lt;p&gt;I have tried &lt;a href="http://support.apple.com/kb/ht1379"&gt;resetting the NVRAM&lt;/a&gt;, &lt;a href="http://support.apple.com/kb/ht3964"&gt;resetting the System Management Controller (SMC)&lt;/a&gt;, running a successful &lt;a href="http://support.apple.com/kb/ht1782"&gt;verify/repair disk&lt;/a&gt;** and verify/repair disk permissions, booting in safe mode, running the OS-X &lt;a href="http://www.thexlab.com/faqs/maintscripts.html"&gt;maintenance scripts manually&lt;/a&gt;, and finally deleting the sleepimage and swapfile* in /private/var/vm. Note that I&amp;#8217;m not trying to &lt;a href="http://forums.macrumors.com/showthread.php?t=1480259"&gt;block the OS from writing the 8Gb sleepimage file as others have done&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So basically, I tried all the tricks I found via Google or that have so far been recommended to me.&lt;/p&gt;
&lt;p&gt;I also learnt about purging (&lt;a href="http://www.macupdate.com/app/mac/45304/ram-cleaning"&gt;either with an app &lt;/a&gt;or &lt;a href="http://osxdaily.com/2012/04/24/free-up-inactive-memory-in-mac-os-x-with-purge-command/"&gt;just manually&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Despite all these ideas failing, I still think that it&amp;#8217;s a software issue. Maybe the disk (SSD; these are the details: CT1895178&amp;#160;256GB Crucial m4&amp;#160;2.5-inch SATA 6GB/s) is corrupted but for now it does pass the Disk Utility check.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m also considering updating the OS from 10.7.5 to Mountain Lion hoping for the best. But maybe that&amp;#8217;s too hopeful.&lt;/p&gt;
&lt;p&gt;Or maybe it&amp;#8217;s RAM itself? Dunno, but this whole thing prompted me to buy an upgrade which I&amp;#8217;ll install in a couple days.&lt;/p&gt;
&lt;p&gt;But regardless of the OS/RAM upgrades, now I have to deal with the frequent freezes and hope that there is another solution out there that I haven&amp;#8217;t found/tried. So, if you have an idea, please let me know!&lt;/p&gt;
&lt;p&gt;I find it rather&amp;#8230; well, upsetting that a rather simple issue (using too much RAM -&amp;gt; swapping -&amp;gt; freezing -&amp;gt; hard reset) would cause other stuff to break so easily.&lt;/p&gt;

&lt;p&gt;** It did repair something the first time, but hasn&amp;#8217;t found anything to repair after the multiple hard resets that have followed. This is the part of the Disk Utility log of the &amp;#8220;disk verify&amp;#8221; that shows the error that was then fixed by &amp;#8220;disk repair&amp;#8221;.&lt;/p&gt;
&lt;p&gt;2013-04-09&amp;#160;13:48:47 -0400: Checking volume bitmap.&lt;/p&gt;
&lt;p&gt;2013-04-09&amp;#160;13:48:47 -0400: Volume bitmap needs minor repair for orphaned blocks&lt;/p&gt;
&lt;p&gt;2013-04-09&amp;#160;13:48:47 -0400: Checking volume information.&lt;/p&gt;
&lt;p&gt;2013-04-09&amp;#160;13:48:47 -0400: Invalid volume free block count&lt;/p&gt;
&lt;p&gt;2013-04-09&amp;#160;13:48:47 -0400: (It should be 15854995 instead of 15763025)&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/47680956215</link><guid>http://fellgernon.tumblr.com/post/47680956215</guid><pubDate>Thu, 11 Apr 2013 00:42:00 -0400</pubDate><category>Mac</category><category>Problem</category><category>Freeze</category><category>RAM</category><category>help</category></item><item><title>Have you been 'relative stupid'?</title><description>&lt;p&gt;I enjoyed reading &amp;#8220;&lt;a href="http://jcs.biologists.org/content/121/11/1771.full"&gt;The importance of stupidity in scientific research&lt;/a&gt;&amp;#8221; by Martin A. Schwartz which I learned existed through &lt;a href="https://twitter.com/hmason"&gt;@hmason&lt;/a&gt; and &lt;a href="https://twitter.com/simplystats"&gt;@simplystats&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;I found the point of how it&amp;#8217;s normal to feel stupid in academia and especially in Ph.D. programs to be illuminating. But Schwartz clarifies that there are other kinds of stupid:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;we don&amp;#8217;t do a good enough job of teaching our students how to be productively stupid – that is, if we don&amp;#8217;t feel stupid it means we&amp;#8217;re not really trying. I&amp;#8217;m not talking about `relative stupidity&amp;#8217;, in which the other students in the class actually read the material, think about it and ace the exam, whereas you don&amp;#8217;t. I&amp;#8217;m also not talking about bright people who might be working in areas that don&amp;#8217;t match their talents. Science involves confronting our `absolute stupidity&amp;#8217;. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I don&amp;#8217;t know about you, but I have certainly been &amp;#8216;relative stupid&amp;#8217; at times. &lt;/p&gt;
&lt;p&gt;And yes, we have to confront our &amp;#8216;absolute stupidity&amp;#8217;. But to me, graduate school is also about learning how to be super efficient with your time. That implies being highly organized, learning how to channel your distractions, and finding sources of constant motivation. For example, I now read more stats/R/research blogs as part of my set of distractions and have considerably decreased how much sports news I read. &lt;/p&gt;
&lt;p&gt;I also struggle with the internal challenge of doing great at school, but then also &amp;#8216;having a life&amp;#8217;. So yes, at times I have been &amp;#8216;relative stupid&amp;#8217; but also had a great time. After all, I no longer need to &amp;#8216;ace&amp;#8217; all my exams.&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/46781594476</link><guid>http://fellgernon.tumblr.com/post/46781594476</guid><pubDate>Sun, 31 Mar 2013 15:47:16 -0400</pubDate><category>GSL</category><category>relative</category><category>stupid</category><category>academia</category></item><item><title>"Do analytics really tell the whole story?"</title><description>&lt;p&gt;&amp;#8220;&lt;a href="http://www.packers.com/news-and-events/article-1/Do-analytics-really-tell-the-whole-story/86248baa-e8ec-4772-a0df-7693676812be?campaign=FB130330"&gt;Do analytics really tell the whole story?&lt;/a&gt;&amp;#8221; by Vic Ketchman explores how analytics is used nowadays in the NFL draft. The entry point is the &amp;#8220;Moneyball&amp;#8221; movie and Ketchman&amp;#8217;s piece is mainly a digest of an interview with Tony Villiotti from draftmetrics.com.&lt;/p&gt;
&lt;p&gt;According to him:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What is analytics? It’s the accumulation of meaningful patterns in data, for the purpose of using that data to predict future results.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I&amp;#8217;m not a fan of the wording used, but well, the point they make is that they use data to predict the future.&lt;/p&gt;
&lt;p&gt;My main issue with this article is that after the previous quote Ketchman pretty much describes some of the data. Description of the data—in my opinion—is part of what we call EDA: Exploratory Data Analysis. &lt;span&gt;The data is interesting, but there are really not many predictions made.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m also concerned by how some of the data is presented. For example, is the 37.1 percent rate of starts by first-round picks really different from 35.5 for the teams with losing records? Plus, it&amp;#8217;s data from only a single year! So I think that it&amp;#8217;s not enough to actually answer any question.&lt;/p&gt;
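&lt;p&gt;One rough way to check whether two rates like these really differ is a two-sample test of proportions. The sketch below is only an illustration: the article reports the percentages but not how many possible starts each group had, so the sample sizes (and the group labels) are made up, and the conclusion depends entirely on them.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Hypothetical sample sizes: the article gives only the percentages, so
## the number of possible starts per group is an assumption
n &lt;- c(groupA = 1000, losingTeams = 1000)
starts &lt;- round(c(0.371, 0.355) * n)

## Two-sample test for equality of proportions (base R, stats package)
res &lt;- prop.test(x = starts, n = n)

## p-value and confidence interval for the difference in proportions
res$p.value
res$conf.int
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With sample sizes of this order, a gap of 1.6 percentage points is well within sampling noise. Much larger (real) counts would be needed before the difference meant anything.&lt;/p&gt;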
&lt;p&gt;To end my comment, Ketchman asks:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;How do you like those analytics?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I don&amp;#8217;t like them much. Sure, some of the numbers presented are interesting but the &amp;#8216;analytics&amp;#8217; are far from being great. &lt;span&gt;Though I bet Villiotti has more interesting results that are only seen by the NFL teams.&lt;/span&gt;&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/46779711594</link><guid>http://fellgernon.tumblr.com/post/46779711594</guid><pubDate>Sun, 31 Mar 2013 15:23:44 -0400</pubDate><category>NFL</category><category>analytics</category></item><item><title>Great commentary on sequestration's impact on research! National media should talk about this and YOU should read it!!!</title><description>&lt;p&gt;Today &lt;a href="http://www.biostat.jhsph.edu/%7Ejleek/"&gt;Jeffrey T. Leek&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Steven_Salzberg"&gt;Steven L. Salzberg&lt;/a&gt; published a commentary in Genome Biology titled “&lt;a href="http://genomebiology.com/2013/14/3/109"&gt;Sequestration: inadvertently killing biomedical research to score political points&lt;/a&gt;” (&lt;span class="showtooltip" title="Leek J and Salzberg S (2013). Sequestration: Inadvertently  Killing Biomedical Research to Score Political Points. _Genome  Biology_, *14*. ISSN 1465-6906,   http://dx.doi.org/10.1186/gb-2013-14-3-109."&gt;&lt;a href="http://dx.doi.org/10.1186/gb-2013-14-3-109"&gt;Leek &amp;amp; Salzberg, 2013&lt;/a&gt;&lt;/span&gt;) which I think is a &lt;strong&gt;must read for anyone&lt;/strong&gt;. Seriously!&lt;/p&gt;
&lt;p&gt;I do not mean &lt;em&gt;anyone involved in research&lt;/em&gt;, or all scientists. I mean, this commentary should be in the &lt;strong&gt;national media&lt;/strong&gt;. &lt;strong&gt;Why&lt;/strong&gt;?&lt;/p&gt;
&lt;p&gt;Well, let me approach the technical side first. You might think that anything that appears in a scientific journal—despite any efforts to make it accessible to the general public—will rely on words whose meaning is mostly only understood by scientists. That is not the case in this commentary: it is a dual letter meant to be read by those in Congress, but it is also an educational commentary for the general public.&lt;/p&gt;
&lt;p&gt;The main reason why &lt;strong&gt;you&lt;/strong&gt; should be reading this commentary is that the consequences of the &amp;#8216;sequester&amp;#8217; are going to affect &lt;strong&gt;you&lt;/strong&gt;. So if you are interested in your future and the well-being of those you care for, then you should read it. And if you don&amp;#8217;t know what the sequester is and how it will impact research, well, that&amp;#8217;s another reason why you should read this commentary. Plus you might want to look at this (serious) comic from &lt;code&gt;PhD comics&lt;/code&gt; (&lt;span class="showtooltip" title="Cham J (2013). U.S. Budget Sequestration Explained.   http://www.phdcomics.com/comics.php?f=1561."&gt;&lt;a href="http://www.phdcomics.com/comics.php?f=1561"&gt;© Cham, 2013&lt;/a&gt;&lt;/span&gt;) to get an overall idea. Note that it was published before sequestration kicked in.&lt;/p&gt;
&lt;p&gt;&lt;img alt="PhD comic on the sequester" src="http://www.phdcomics.com/comics/archive/phd021513s.gif"/&gt;&lt;/p&gt;
&lt;p&gt;Going back to the commentary piece by Leek and Salzberg, I can imagine someone objecting like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hey, but I don&amp;#8217;t live in the United States so it doesn&amp;#8217;t affect me.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That is true in a sense because you will likely be affected by your own country&amp;#8217;s policies more directly, especially in policy topics that have a short-term impact. Nevertheless, any breakthrough made by U.S.-based research will for the most part (that is, when politics doesn&amp;#8217;t get in the way) reach you. After all, Leek and Salzberg cite (&lt;span class="showtooltip" title="(2013). The Sequester Is Going to Devastate U.S. Science Research  for Decades.   http://www.theatlantic.com/politics/archive/2013/03/the-sequester-is-going-to-devastate-us-science-research-for-decades/273925/  [Online. last-accessed: 2013-03-28 03:33:46].   http://www.theatlantic.com/politics/archive/2013/03/the-sequester-is-going-to-devastate-us-science-research-for-decades/273925/."&gt;&lt;a href="http://www.theatlantic.com/politics/archive/2013/03/the-sequester-is-going-to-devastate-us-science-research-for-decades/273925/"&gt;Alivisatos et al in The Atlantic, 2013&lt;/a&gt;&lt;/span&gt;) where the following statement is made:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Nobel Prize-winning economist Robert Solow has &lt;a href="http://magazine.amstat.org/blog/2011/03/01/econgrowthmar11/"&gt;calculated&lt;/a&gt; that over the past half century, more than half of the growth in our nation&amp;#8217;s GDP has been rooted in scientific discoveries – the kinds of fundamental, mission-driven research that we do at the labs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The claim that it affects other countries is just a generalization of the previous result and what I would consider some common sense. If this is not enough to attract your interest, then you should take a look at Salzberg&amp;#8217;s previous comment “&lt;a href="http://genome.fieldofscience.com/2013/03/a-breakthrough-cure-for-acute-leukemia.html"&gt;A breakthrough cure for acute leukemia?&lt;/a&gt;” that is a showcase example of successful biomedical research funded by the same institutions being hit by sequestration.&lt;/p&gt;
&lt;p&gt;I hope to have convinced you to read Leek and Salzberg&amp;#8217;s commentary by now. So let me talk a little bit about the things that I liked the most.&lt;/p&gt;
&lt;p&gt;Most of all, I like the tone they used because this is not a silly matter and while it may sound as alarming as &lt;a href="http://en.wikipedia.org/wiki/The_Boy_Who_Cried_Wolf"&gt;the boy who cried wolf&lt;/a&gt;, the reality is that the wolf does exist and will visit you. So while no visible effects have been seen from the sequester this month, that doesn&amp;#8217;t mean that you can just ignore this problem. It is like when you throw a stone in calm water: just a few small ripples are seen at the beginning, but they reach far away. In other words, it will take some time to actually feel the negative effects.&lt;/p&gt;
&lt;p&gt;Overall, I consider Leek and Salzberg&amp;#8217;s work a &lt;strong&gt;wake-up call&lt;/strong&gt; to politicians and &lt;strong&gt;you&lt;/strong&gt;: &lt;em&gt;you&lt;/em&gt; the researcher but, most importantly, &lt;em&gt;you&lt;/em&gt; the citizen who cares about the future.&lt;/p&gt;
&lt;p&gt;Some, especially those who are major supporters of military programs, might disagree with comparing the F-35 plane, which has an estimated cost of $400 billion, to the National Institutes of Health (NIH) annual budget of around $31 billion (&lt;span class="showtooltip" title="Leek J and Salzberg S (2013). Sequestration: Inadvertently  Killing Biomedical Research to Score Political Points. _Genome  Biology_, *14*. ISSN 1465-6906,   http://dx.doi.org/10.1186/gb-2013-14-3-109."&gt;&lt;a href="http://dx.doi.org/10.1186/gb-2013-14-3-109"&gt;Leek &amp;amp; Salzberg, 2013&lt;/a&gt;&lt;/span&gt;). But to me this is just incredible!&lt;/p&gt;
&lt;p&gt;&lt;strike&gt;To end my comments, I have to say that I am surprised that Leek and Salzberg&amp;#8217;s commentary is behind a paywall. I thought that it would be an open-access piece. After all research articles in Genome Biology are open-access, but this is a commentary so it is not considered a research article. To their credit, Genome Biology does offer 30-day free trial subscriptions. But I am afraid that Leek and Salzberg will lose many readers due to this reason. Hopefully, &lt;strong&gt;you&lt;/strong&gt; will feel motivated enough to go through the whole trial subscription process, or maybe Genome Biology will make an exception for this commentary.&lt;/strike&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; Genome Biology changed Leek &amp;amp; Salzberg&amp;#8217;s commentary so &lt;span&gt;as of March 28th &lt;/span&gt;&lt;span&gt;it is now open-access (I wrote the post late on the 27th).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Finally, are you not incredulous to see this situation happen? Shouldn&amp;#8217;t the debate be about spending more money on research now than was spent in the past? The whole sequestration topic is alarming, but the fact that the budget for research hasn&amp;#8217;t increased in years is &lt;strong&gt;shocking&lt;/strong&gt;. Oh wait, you are giving Mexico a chance to catch up to the mighty U.S. in research!!! The whole talk in Mexico about catching up with Brazil or India should be about the U.S. now! (Sadly, Mexico has a lot of catching up to do…)&lt;/p&gt;
&lt;p&gt;Citations made with &lt;code&gt;knitcitations&lt;/code&gt; (&lt;span class="showtooltip" title="Boettiger C (2013). _knitcitations: Citations for knitr markdown  files_. R package version 0.4-4,   https://github.com/cboettig/knitcitations."&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;Boettiger, 2013&lt;/a&gt;&lt;/span&gt;) and the post was written in the Rmd format powered by &lt;code&gt;knitr&lt;/code&gt; (&lt;span class="showtooltip" title="Xie Y (2013). _knitr: A general-purpose package for dynamic report  generation in R_. R package version 1.1,   http://CRAN.R-project.org/package=knitr."&gt;&lt;a href="http://CRAN.R-project.org/package=knitr"&gt;Xie, 2013&lt;/a&gt;&lt;/span&gt;).&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;(2013) The Sequester Is Going to Devastate U.S. Science Research for Decades. &lt;em&gt;The Atlantic&lt;/em&gt; &lt;a href="http://www.theatlantic.com/politics/archive/2013/03/the-sequester-is-going-to-devastate-us-science-research-for-decades/273925/"&gt;&lt;a href="http://www.theatlantic.com/politics/archive/2013/03/the-sequester-is-going-to-devastate-us-science-research-for-decades/273925/"&gt;http://www.theatlantic.com/politics/archive/2013/03/the-sequester-is-going-to-devastate-us-science-research-for-decades/273925/&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Carl Boettiger, knitcitations: Citations for knitr markdown files. &lt;a href="https://github.com/cboettig/knitcitations"&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;https://github.com/cboettig/knitcitations&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Yihui Xie, (2013) knitr: A general-purpose package for dynamic report generation in R. &lt;a href="http://CRAN.R-project.org/package=knitr"&gt;&lt;a href="http://CRAN.R-project.org/package=knitr"&gt;http://CRAN.R-project.org/package=knitr&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Jorge Cham, U.S. Budget Sequestration Explained. &lt;a href="http://www.phdcomics.com/comics.php?f=1561"&gt;&lt;a href="http://www.phdcomics.com/comics.php?f=1561"&gt;http://www.phdcomics.com/comics.php?f=1561&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Jeffrey T Leek, Steven L Salzberg, (2013) Sequestration: Inadvertently Killing Biomedical Research to Score Political Points. &lt;em&gt;Genome Biology&lt;/em&gt; &lt;strong&gt;14&lt;/strong&gt; &lt;a href="http://dx.doi.org/10.1186/gb-2013-14-3-109"&gt;10.1186/gb-2013-14-3-109&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://fellgernon.tumblr.com/post/46483321621</link><guid>http://fellgernon.tumblr.com/post/46483321621</guid><pubDate>Thu, 28 Mar 2013 00:25:00 -0400</pubDate><category>Sequester</category><category>Sequestration</category><category>WakeUp</category><category>MustRead</category><category>Paper comments</category><category>Research</category></item><item><title>New Citi ThankYou balloon TV commercial looks like a ripoff of Hello Kitty in Space</title><description>&lt;p&gt;I don&amp;#8217;t know about you, but I think that this new &amp;#8220;Citi ThankYou Cards&amp;#8221; TV commercial is trying to ride the popularity train from the &amp;#8220;HELLO KITTY IN SPACE&amp;#8221; video.&lt;/p&gt;
&lt;p&gt;&lt;iframe frameborder="0" height="315" src="http://www.youtube.com/embed/5qZQ0POmpkM?rel=0" width="560"&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;&lt;iframe frameborder="0" height="315" src="http://www.youtube.com/embed/5REsCTG4-Gg?rel=0" width="560"&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;Hm&amp;#8230; it looks like a ripoff, smells like a ripoff, tastes like a ripoff&amp;#8230; is it a ripoff?&lt;/p&gt;

&lt;p&gt;Maybe it&amp;#8217;s just flattery, maybe it&amp;#8217;s imitation, or maybe it&amp;#8217;s copyright infringement. What do you think?&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/46462799294</link><guid>http://fellgernon.tumblr.com/post/46462799294</guid><pubDate>Wed, 27 Mar 2013 20:16:00 -0400</pubDate><category>Citi</category><category>Hello</category><category>Kitty</category><category>Ballon</category><category>ripoff</category><category>TV</category><category>Commercial</category><category>Space</category></item><item><title>Predicting who will win a NFL match at half time</title><description>&lt;p&gt;It was great to have a little break, &lt;em&gt;Spring break&lt;/em&gt;, although the weather didn&amp;#8217;t feel like spring at all! During the early part of the break I worked on my final project for Jeff Leek&amp;#8217;s data analysis class, which we call 140.753 here. Continuing &lt;a href="http://fellgernon.tumblr.com/tagged/jhsph753#.UU44Y1vF2c4"&gt;my previous posts on the topic&lt;/a&gt;, this time I&amp;#8217;ll share the results of my final project.&lt;/p&gt;
&lt;p&gt;At the beginning of the course, we had to submit a project plan (more like a proposal) and &lt;a href="https://github.com/lcolladotor/lcollado753/blob/master/hw/projectplan/lcollado_projectplan.pdf"&gt;in mine&lt;/a&gt; I announced my interest in looking into some sports data. I included a few links to Brian Burke&amp;#8217;s Advanced NFL Stats site (&lt;span class="showtooltip" title="(2013). Advanced NFL Stats.   http://www.advancednflstats.com/ [Online. last-accessed:  2013-03-23 23:28:38].  http://www.advancednflstats.com/."&gt;&lt;a href="http://www.advancednflstats.com/"&gt;Burke&lt;/a&gt;&lt;/span&gt;). At the time I didn&amp;#8217;t know that Burke&amp;#8217;s site described in detail a lot of the information I would end up using.&lt;/p&gt;
&lt;p&gt;My final project had to do with splitting NFL games by half and then using only the play-by-play data from the first half to predict if team A or B would win the game. My overall goal was to have some fun with sports data, which I had never looked at, and also to come up with something I would personally use in the future. So, why split games by half? I personally would like to know at half time if I should keep watching a game or not. Having a tool to help me decide would be great, and well, if the team I&amp;#8217;m rooting for has a high chance of losing or winning, ideally I would switch to doing something else. A related question that I didn&amp;#8217;t try to answer is: which half is worth watching? This would be a meaningful question if you only have time to watch one of them.&lt;/p&gt;
&lt;p&gt;To truly satisfy my goals, it wasn&amp;#8217;t enough to just build a predictive model. That is why I also built a web application using the &lt;code&gt;shiny&lt;/code&gt; package (&lt;span class="showtooltip" title="RStudio and Inc. (2013). _shiny: Web Application Framework for R_.  R package version 0.4.0,   http://CRAN.R-project.org/package=shiny."&gt;&lt;a href="http://CRAN.R-project.org/package=shiny"&gt;RStudio and Inc., 2013&lt;/a&gt;&lt;/span&gt;). It was the first time I did a shiny app, but thanks to the good manual and some examples on GitHub from John Muschelli like his &lt;a href="https://github.com/muschellij2/Shiny_model"&gt;Shiny_model&lt;/a&gt; it wasn&amp;#8217;t so bad. I thus invite you to test and browse my shiny app at &lt;a href="http://glimmer.rstudio.com/lcolladotor/NFLhalf/"&gt;&lt;a href="http://glimmer.rstudio.com/lcolladotor/NFLhalf/"&gt;http://glimmer.rstudio.com/lcolladotor/NFLhalf/&lt;/a&gt;&lt;/a&gt;. It could be improved by adding some functions that scrape live data for the 2013 season so you don&amp;#8217;t have to input all the variables needed by using the sliders. Anyhow, I&amp;#8217;m happy with the result.&lt;/p&gt;
&lt;p&gt;The entire project&amp;#8217;s code, EDA steps, shiny app, and report are available via GitHub in my repository (&lt;span class="showtooltip" title="lcolladotor (2013). lcollado753.   https://github.com/lcolladotor/lcollado753 [Online.  last-accessed: 2013-03-21 02:23:49].   https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half."&gt;&lt;a href="https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half"&gt;lcollado753&lt;/a&gt;&lt;/span&gt;). While the details are in the report, I&amp;#8217;ll give a brief summary here.&lt;/p&gt;
&lt;p&gt;Basically, I summarized the play-by-play data for all NFL games from 2002 to 2012 seasons as provided by Burke (&lt;span class="showtooltip" title="(2010). Advanced NFL Stats: Play-by-Play Data.   http://www.advancednflstats.com/2010/04/play-by-play-data.html  [Online. last-accessed: 2013-03-24 00:08:20].   http://www.advancednflstats.com/2010/04/play-by-play-data.html."&gt;&lt;a href="http://www.advancednflstats.com/2010/04/play-by-play-data.html"&gt;Burke, 2010&lt;/a&gt;&lt;/span&gt;). I used some of the variables Burke uses (&lt;span class="showtooltip" title="(2009). Advanced NFL Stats: How the Model Works-A Detailed  Example Part 1.   http://www.advancednflstats.com/2009/01/how-model-works-detailed-example.html  [Online. last-accessed: 2013-03-24 00:08:21].   http://www.advancednflstats.com/2009/01/how-model-works-detailed-example.html."&gt;&lt;a href="http://www.advancednflstats.com/2009/01/how-model-works-detailed-example.html"&gt;Burke, 2009&lt;/a&gt;&lt;/span&gt;) and some others like the score difference, who starts the second half, and the game day winning percentages of both teams. After exploring the data, I discarded the years 2002 to 2005. Then, I trained a model using the 2006 to 2011 data and did some quick model selection. Note that I&amp;#8217;m not doing the adjustment by opponent the way Burke did it (&lt;span class="showtooltip" title="(2009). Advanced NFL Stats: How the Model Works-A Detailed  Example Part 2.   http://www.advancednflstats.com/2009/01/how-model-works-detailed-example-part-2.html  [Online. last-accessed: 2013-03-24 00:08:23].   
http://www.advancednflstats.com/2009/01/how-model-works-detailed-example-part-2.html."&gt;&lt;a href="http://www.advancednflstats.com/2009/01/how-model-works-detailed-example-part-2.html"&gt;Burke, 2009-2&lt;/a&gt;&lt;/span&gt;) in part because I was running out of time, but also because the model already uses the current game winning percentages of both teams to account for the two teams&amp;#8217; strength. I evaluated the model using the 2012 data and, after seeing that it worked decently enough, I trained a second model using the data from 2006 to 2012 so it can be used for the 2013 season. These two trained models are the ones available in the shiny app I made.&lt;/p&gt;
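&lt;p&gt;As a minimal sketch of this train-then-evaluate scheme, here is the same idea on simulated data. It assumes a logistic regression fitted with &lt;code&gt;glm()&lt;/code&gt;; the data frame and its columns (&lt;code&gt;season&lt;/code&gt;, &lt;code&gt;scoreDiff&lt;/code&gt;, &lt;code&gt;win&lt;/code&gt;) are hypothetical stand-ins, not the variables or the model from the report.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Simulated stand-in for the per-game summaries (hypothetical columns)
set.seed(753)
n &lt;- 600
games &lt;- data.frame(season = sample(2006:2012, n, replace = TRUE), scoreDiff = rnorm(n, 
    mean = 0, sd = 10))
games$win &lt;- rbinom(n, size = 1, prob = plogis(0.15 * games$scoreDiff))

## Train on 2006 to 2011 and hold out 2012 for evaluation
train &lt;- subset(games, season &lt;= 2011)
holdout &lt;- subset(games, season == 2012)
fit &lt;- glm(win ~ scoreDiff, family = binomial, data = train)

## Predicted win probabilities for the held-out season
prob &lt;- predict(fit, newdata = holdout, type = "response")

## Proportion of held-out games called correctly with a 0.5 cutoff
mean((prob &gt; 0.5) == (holdout$win == 1))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Holding out the most recent season mimics how the model would actually be used: it is fitted on past seasons and applied to games it has never seen.&lt;/p&gt;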
&lt;p&gt;In the report, I didn&amp;#8217;t include ROC curves (a big miss), so here they are. The code below is heavily based on a post on GLMs (&lt;span class="showtooltip" title="denishaine (2013). Veterinary Epidemiologic Research: GLM  \ Evaluating Logistic Regression Models (part 3).   http://denishaine.wordpress.com/2013/03/19/veterinary-epidemiologic-research-glm-evaluating-logistic-regression-models-part-3/  [Online. last-accessed: 2013-03-23 22:51:49].   http://denishaine.wordpress.com/2013/03/19/veterinary-epidemiologic-research-glm-evaluating-logistic-regression-models-part-3/."&gt;&lt;a href="http://denishaine.wordpress.com/2013/03/19/veterinary-epidemiologic-research-glm-evaluating-logistic-regression-models-part-3/"&gt;denishaine, 2013&lt;/a&gt;&lt;/span&gt;). It is written so that you can easily reproduce it if you have cloned my repository for the 140.753 class (&lt;span class="showtooltip" title="lcolladotor (2013). lcollado753.   https://github.com/lcolladotor/lcollado753 [Online.  last-accessed: 2013-03-21 02:23:49].   https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half."&gt;&lt;a href="https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half"&gt;lcollado753&lt;/a&gt;&lt;/span&gt;).&lt;/p&gt;
&lt;p&gt;First, some setup steps.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Specify the directory where you cloned the lcollado753 repo
maindir &amp;lt;- "whereYouClonedTheRepo"
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Load packages needed
suppressMessages(library(ROCR))
library(ggplot2)

## Load fits.  Remember that 1st one used data from 2006 to 2011 and the
## 2nd one used data from 2006 to 2012.
load(paste0(maindir, "/lcollado753/final/nfl_half/EDA/model/fits.Rdata"))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, I make the ROC curves for both trained models using the data that they were trained on. They should look quite good, since each model is predicting the same data that was used to build it.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Make the ROC plots

## Simple list where I'll store all the results so I can compare the ROC
## plots later on
all &amp;lt;- list()

## Loop over both fitted models
for (i in 1:2) {
    ## Predict on the original data
    pred &amp;lt;- predict(fits[[i]])

    ## Subset original data (remove NA's)
    data &amp;lt;- fits[[i]]$data
    data &amp;lt;- data[complete.cases(data), ]

    ## Construct prediction function
    pred.fn &amp;lt;- prediction(pred, data$win)

    ## Get performance info
    perform &amp;lt;- performance(pred.fn, "tpr", "fpr")

    ## Get ready to plot
    toPlot &amp;lt;- data.frame(tpr = unlist(slot(perform, "y.values")), fpr = unlist(slot(perform, 
        "x.values")))
    all &amp;lt;- c(all, list(toPlot))

    ## Make the plot
    res &amp;lt;- ggplot(toPlot) + geom_line(aes(x = fpr, y = tpr)) + geom_abline(intercept = 0, 
        slope = 1, colour = "orange") + ylab("Sensitivity") + xlab("1 - Specificity") + 
        ggtitle(paste("Years 2006 to", c("2011", "2012")[i]))
    print(res)

    ## Print the AUC value
    print(unlist(performance(pred.fn, "auc")@y.values))
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="plot of chunk ROC" src="http://i.imgur.com/b1FS2ml.png"/&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## [1] 0.8506
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="plot of chunk ROC" src="http://i.imgur.com/f2UOySy.png"/&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## [1] 0.8513
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Both ROC plots look pretty similar (well, the data sets are very similar!) and have relatively high AUC values.&lt;/p&gt;
&lt;p&gt;Next, I make the ROC plot using the model trained with the data from 2006 to 2011 to predict the outcomes for the 2012 games.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Load 2012 data
load(paste0(maindir, "/lcollado753/final/nfl_half/data/pred/info2012.Rdata"))

## Predict using model fit with data from 2006 to 2011
pred &amp;lt;- predict(fits[[1]], info2012)

## Construct prediction function
pred.fn &amp;lt;- prediction(pred, info2012$win)

## Get performance info
perform &amp;lt;- performance(pred.fn, "tpr", "fpr")

## Get ready to plot
toPlot &amp;lt;- data.frame(tpr = unlist(slot(perform, "y.values")), fpr = unlist(slot(perform, 
    "x.values")))
all &amp;lt;- c(all, list(toPlot))

## Make the plot
ggplot(toPlot) + geom_line(aes(x = fpr, y = tpr)) + geom_abline(intercept = 0, 
    slope = 1, colour = "orange") + ylab("Sensitivity") + xlab("1 - Specificity") + 
    ggtitle("Model trained 2006-2011 predicting 2012")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="plot of chunk pred2012" src="http://i.imgur.com/DDcsW7W.png"/&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;
## Print the AUC value
print(unlist(performance(pred.fn, "auc")@y.values))
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 0.816
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The steps in the curve are more visible since it uses less data. The curve is also a little worse than the other two, as expected. This is clear when comparing the AUC values: 0.816 here versus about 0.85 for the curves on the training data.&lt;/p&gt;
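&lt;p&gt;For intuition about what these AUC values measure: the area under the ROC curve equals the probability that a randomly chosen win gets a higher predicted score than a randomly chosen loss. The helper below (&lt;code&gt;aucByRank&lt;/code&gt;, a hypothetical name, not part of &lt;code&gt;ROCR&lt;/code&gt;) is a small base R sketch of that rank-based calculation on made-up scores.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## AUC from ranks: the share of (positive, negative) pairs in which the
## positive case has the higher score (ties count as one half)
aucByRank &lt;- function(score, label) {
    r &lt;- rank(score)
    nPos &lt;- sum(label == 1)
    nNeg &lt;- sum(label == 0)
    (sum(r[label == 1]) - nPos * (nPos + 1)/2)/(nPos * nNeg)
}

## Toy scores and outcomes, made up for illustration
score &lt;- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
label &lt;- c(1, 1, 0, 1, 0, 0)

## 8 of the 9 positive-negative pairs are ordered correctly, so this is 8/9
aucByRank(score, label)
&lt;/code&gt;&lt;/pre&gt;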
&lt;p&gt;Finally, I plot all curves in the same picture to visually compare them.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;names(all) &amp;lt;- c("train2011", "train2012", "pred2012")
for (i in 1:3) {
    all[[i]] &amp;lt;- cbind(all[[i]], rep(names(all)[i], nrow(all[[i]])))
    colnames(all[[i]])[3] &amp;lt;- "set"
}
all &amp;lt;- do.call(rbind, all)

ggplot(all) + geom_line(aes(x = fpr, y = tpr, colour = set)) + geom_abline(intercept = 0, 
    slope = 1, colour = "orange") + ylab("Sensitivity") + xlab("1 - Specificity") + 
    ggtitle("Comparing ROCs")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="plot of chunk allInOne" src="http://i.imgur.com/tUVfgfs.png"/&gt;&lt;/p&gt;
&lt;p&gt;Both ROC curves on the training data (train2011, train2012) are nearly identical and both are slightly superior to the one predicting the 2012 games.&lt;/p&gt;
&lt;p&gt;Overall I am happy with the results and while some things can certainly be improved, I look forward to the NFL 2013 season. Also, remember that Burke publishes his estimated win probabilities from week 4 onward (&lt;span class="showtooltip" title="BURKE BB (2013). Brian Burke - The Fifth Down Blog -  NYTimes.com.   http://fifthdown.blogs.nytimes.com/author/brian-burke/ [Online.  last-accessed: 2013-03-24 00:26:32].   http://fifthdown.blogs.nytimes.com/author/brian-burke/."&gt;&lt;a href="http://fifthdown.blogs.nytimes.com/author/brian-burke/"&gt;The Fifth Down Blog&lt;/a&gt;&lt;/span&gt;). So you might be interested in comparing the probability at half time with his estimated probability, which is calculated before the game starts. I mean, maybe you could use the difference between the two to get an idea of how unexpected the first half was. After all, if a game falls outside the pattern it might be worth watching.&lt;/p&gt;
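&lt;p&gt;A tiny sketch of that comparison, using made-up probabilities for three imaginary games (neither vector comes from Burke&amp;#8217;s estimates or from my model):&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;## Hypothetical probabilities that team A wins each of three games
preGame &lt;- c(0.7, 0.55, 0.3)
halfTime &lt;- c(0.65, 0.2, 0.75)

## Absolute change between the pre-game and half time probabilities
surprise &lt;- abs(halfTime - preGame)

## Games ordered from most to least unexpected first half
order(surprise, decreasing = TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The absolute difference is just the simplest choice here. A difference on the log-odds scale would weigh changes near 0 or 1 more heavily.&lt;/p&gt;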
&lt;p&gt;Citations made with &lt;code&gt;knitcitations&lt;/code&gt; (&lt;span class="showtooltip" title="Boettiger C (2013). _knitcitations: Citations for knitr markdown  files_. R package version 0.4-4,   https://github.com/cboettig/knitcitations."&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;Boettiger, 2013&lt;/a&gt;&lt;/span&gt;).&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;lcolladotor, lcollado753. &lt;em&gt;GitHub&lt;/em&gt; &lt;a href="https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half"&gt;&lt;a href="https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half"&gt;https://github.com/lcolladotor/lcollado753/tree/master/final/nfl_half&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;denishaine, (2013) Veterinary Epidemiologic Research: GLM &amp;#8211; Evaluating Logistic Regression Models (part 3). &lt;em&gt;denis haine&lt;/em&gt; &lt;a href="http://denishaine.wordpress.com/2013/03/19/veterinary-epidemiologic-research-glm-evaluating-logistic-regression-models-part-3/"&gt;&lt;a href="http://denishaine.wordpress.com/2013/03/19/veterinary-epidemiologic-research-glm-evaluating-logistic-regression-models-part-3/"&gt;http://denishaine.wordpress.com/2013/03/19/veterinary-epidemiologic-research-glm-evaluating-logistic-regression-models-part-3/&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Advanced NFL Stats. &lt;a href="http://www.advancednflstats.com/"&gt;&lt;a href="http://www.advancednflstats.com/"&gt;http://www.advancednflstats.com/&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;(2010) Advanced NFL Stats: Play-by-Play Data. &lt;a href="http://www.advancednflstats.com/2010/04/play-by-play-data.html"&gt;&lt;a href="http://www.advancednflstats.com/2010/04/play-by-play-data.html"&gt;http://www.advancednflstats.com/2010/04/play-by-play-data.html&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;(2009) Advanced NFL Stats: How the Model Works–A Detailed Example Part 1. &lt;a href="http://www.advancednflstats.com/2009/01/how-model-works-detailed-example.html"&gt;&lt;a href="http://www.advancednflstats.com/2009/01/how-model-works-detailed-example.html"&gt;http://www.advancednflstats.com/2009/01/how-model-works-detailed-example.html&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;(2009) Advanced NFL Stats: How the Model Works–A Detailed Example Part 2. &lt;a href="http://www.advancednflstats.com/2009/01/how-model-works-detailed-example-part-2.html"&gt;&lt;a href="http://www.advancednflstats.com/2009/01/how-model-works-detailed-example-part-2.html"&gt;http://www.advancednflstats.com/2009/01/how-model-works-detailed-example-part-2.html&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;By BURKE, Brian Burke - The Fifth Down Blog - NYTimes.com. &lt;em&gt;The Fifth Down &amp;#187; Brian Burke&lt;/em&gt; &lt;a href="http://fifthdown.blogs.nytimes.com/author/brian-burke/"&gt;&lt;a href="http://fifthdown.blogs.nytimes.com/author/brian-burke/"&gt;http://fifthdown.blogs.nytimes.com/author/brian-burke/&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Carl Boettiger, knitcitations: Citations for knitr markdown files. &lt;a href="https://github.com/cboettig/knitcitations"&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;https://github.com/cboettig/knitcitations&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;RStudio , Inc. , (2013) shiny: Web Application Framework for R. &lt;a href="http://CRAN.R-project.org/package=shiny"&gt;&lt;a href="http://CRAN.R-project.org/package=shiny"&gt;http://CRAN.R-project.org/package=shiny&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://fellgernon.tumblr.com/post/46117939292</link><guid>http://fellgernon.tumblr.com/post/46117939292</guid><pubDate>Sat, 23 Mar 2013 20:46:00 -0400</pubDate><category>NFL</category><category>jhsph753</category><category>Adv_NFL_Stats</category><category>rstats</category><category>Prediction</category><category>R</category></item><item><title>"I am a writer" exercise</title><description>&lt;p&gt;I do not have a clear memory of when I started to write or in which language it was. My first written words might have been in English since I lived in Boston (USA) for three years during my early childhood. By age five I was back in Mexico and that is where I am sure I wrote my first full homework assignments. During elementary school, I changed languages once more, this time to French. By middle school, I started to be interested in two new types of languages. One was mathematics, which I liked but didn&amp;#8217;t consider till much later. The other was related to computers, as I learnt the very basics of HTML (that&amp;#8217;s all I know so far). In college, having gone back to Spanish and English, and in my current stage in graduate school, I am a writer because I write my homework (I mainly typeset it using &lt;code&gt;LaTeX&lt;/code&gt;), code in &lt;code&gt;R&lt;/code&gt;, and summarize findings in reports. For the past year, I have been using &lt;a href="http://fellgernon.tumblr.com/"&gt;Fellgernon Bit&lt;/a&gt; to practice writing and hopefully improve my skills. Furthermore, the process of writing helps me clarify my thoughts and organize them before attempting to communicate them. Sometimes it works, other times it doesn&amp;#8217;t. Finally, I am a writer because it is crucial in the academic environment to be able to communicate through the printed word. This is tricky because sometimes you want to be very short and direct without leaving anything important out, like when emailing a professor. 
Other times, you have to be very precise and clear yet tell an interesting story such as when writing a scientific report.&lt;/p&gt;
&lt;p&gt;Overall, I consider myself a writer in training and would like to improve. But as with everything, practice is key. That&amp;#8217;s a big part of why I blog and why I&amp;#8217;m enrolled in this course.&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/45894450462</link><guid>http://fellgernon.tumblr.com/post/45894450462</guid><pubDate>Thu, 21 Mar 2013 00:42:00 -0400</pubDate><category>courseraengcomp</category></item><item><title>And so begins English Composition I</title><description>&lt;p&gt;This week started the English Composition I: Achieving Expertise course (&lt;span class="showtooltip" title="(2013). Coursera.  https://www.coursera.org/ [Online.  last-accessed: 2013-03-21 03:47:13].   https://www.coursera.org/course/composition."&gt;&lt;a href="https://www.coursera.org/course/composition"&gt;Comer, 2013&lt;/a&gt;&lt;/span&gt;) that I have been looking forward to.&lt;/p&gt;
&lt;p&gt;I am not sure yet how long I will last, but I hope to enjoy it as much as I can. Plus, it should help me with my posting and other writing areas. While I last in the course, I plan to publish my writings in the blog too. So you will hopefully see me be more active here.&lt;/p&gt;
&lt;p&gt;As it is important to cite when writing, I have also figured out how to do so automatically in Rmd files. For that I learnt how to use &lt;strong&gt;knitcitations&lt;/strong&gt; from the GitHub instructions (&lt;span class="showtooltip" title="cboettig (2013). knitcitations.   https://github.com/cboettig/knitcitations [Online. last-accessed:  2013-03-21 03:19:44].   https://github.com/cboettig/knitcitations."&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;knitcitations&lt;/a&gt;&lt;/span&gt;) and an explanatory post (&lt;span class="showtooltip" title="Boettiger C (2013). knitcitations.   http://www.carlboettiger.info/2012/05/30/knitcitations.html  [Online. last-accessed: 2013-03-21 02:15:41].   http://www.carlboettiger.info/2012/05/30/knitcitations.html."&gt;&lt;a href="http://www.carlboettiger.info/2012/05/30/knitcitations.html"&gt;Boettiger, 2013&lt;/a&gt;&lt;/span&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;knitcitations&lt;/strong&gt; is great, but it kind of struggles with some pages. That is why I modified my template in &lt;a href="https://github.com/lcolladotor/FBit"&gt;FBit&lt;/a&gt; by writing my own citing function for pages where &lt;code&gt;citep&lt;/code&gt; fails. Here is the code:&lt;/p&gt;
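&lt;pre&gt;&lt;code class="r"&gt;## I made my own citing function since citep() doesn't work the way I want
## with URLs that are not really pages themselves, like part of a GitHub
## repo.
library(knitcitations)  # provides citep()

## year defaults to the current year, taken from the end of date()
mycitep &amp;lt;- function(x, short, year = substr(date(), 21, 24), tooltip = TRUE) {
    ## Let citep() build the citation HTML as usual
    tmp &amp;lt;- citep(x)
    ## Fill the empty link text with the short label
    res &amp;lt;- gsub("&amp;gt;&amp;lt;/a&amp;gt;", paste0("&amp;gt;", short, "&amp;lt;/a&amp;gt;"), tmp)
    if (tooltip) {
        ## Replace the "????" year placeholder with the given year
        res &amp;lt;- gsub("\\?\\?\\?\\?", year, res)
    }
    res
}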

## You already saw an inline working example in the post itself.
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;&lt;li&gt;Carl Boettiger, (2013) knitcitations. &lt;em&gt;Lab Notebook&lt;/em&gt; &lt;a href="http://www.carlboettiger.info/2012/05/30/knitcitations.html"&gt;&lt;a href="http://www.carlboettiger.info/2012/05/30/knitcitations.html"&gt;http://www.carlboettiger.info/2012/05/30/knitcitations.html&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;cboettig, knitcitations. &lt;em&gt;GitHub&lt;/em&gt; &lt;a href="https://github.com/cboettig/knitcitations"&gt;&lt;a href="https://github.com/cboettig/knitcitations"&gt;https://github.com/cboettig/knitcitations&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Coursera. &lt;em&gt;Coursera&lt;/em&gt; &lt;a href="https://www.coursera.org/course/composition"&gt;&lt;a href="https://www.coursera.org/course/composition"&gt;https://www.coursera.org/course/composition&lt;/a&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://fellgernon.tumblr.com/post/45892065847</link><guid>http://fellgernon.tumblr.com/post/45892065847</guid><pubDate>Thu, 21 Mar 2013 00:02:00 -0400</pubDate><category>Coursera</category><category>courseraengcomp</category></item><item><title>FBit: GitHub repo for posts with R code for this blog</title><description>&lt;p&gt;This is a test post since I want to improve upon Jeffrey Horner&amp;#8217;s &lt;a href="http://jeffreyhorner.tumblr.com/post/25943954723/blog-with-r-markdown-and-tumblr-part-ii"&gt;strategy for posting R code in Tumblr&lt;/a&gt;. The only minor improvement I wanted to try out is hosting the images directly on the web. I mean, right now the images won&amp;#8217;t show in RSS readers. I&amp;#8217;m not doing anything new at all, just using the imgur_upload function in &lt;a href="http://yihui.name/knitr/"&gt;knitr&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This is part of my plan to write paper posts. I already created the GitHub repo &lt;a href="https://github.com/lcolladotor/FBit"&gt;FBit&lt;/a&gt; which should host any future posts I make with knitr.&lt;/p&gt;
&lt;p&gt;For now, I&amp;#8217;m testing the &lt;a href="https://github.com/lcolladotor/FBit/blob/master/R-post-template/R-post-template.Rmd"&gt;FBit post template&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="r"&gt;library(ggplot2)
qplot(hp, mpg, data = mtcars) + geom_smooth()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="plot of chunk carplot" src="http://i.imgur.com/zfg0Gih.png"/&gt;&lt;/p&gt;
&lt;p&gt;You can also visualize the test &lt;a href="http://htmlpreview.github.com/?https://github.com/lcolladotor/FBit/blob/master/test-template/test-template.html"&gt;here&lt;/a&gt;&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/45164372110</link><guid>http://fellgernon.tumblr.com/post/45164372110</guid><pubDate>Mon, 11 Mar 2013 23:20:22 -0400</pubDate><category>knitr</category></item><item><title>Commenting scientific papers</title><description>&lt;p&gt;I&amp;#8217;ve been thinking about commenting on papers in blog posts. I did &lt;a href="http://fellgernon.tumblr.com/tagged/Paper%20comments#.UT534NHF0W8"&gt;a few a long time ago&lt;/a&gt;, but now I&amp;#8217;m thinking of doing this activity more systematically, say one paper a week. There are several reasons why. It has the obvious advantage of forcing me to read one paper in depth each week. At the same time, I want to learn more from others: see what I like in other papers and maybe avoid some mistakes. There are two main lines of papers that I would be posting about: anything that is somewhat close to my research (genomics, RNA-seq, biostatistics, Bioconductor, visualization) and anything done by my undergrad peers from &lt;a href="http://www.lcg.unam.mx/"&gt;LCG-UNAM&lt;/a&gt;. I don&amp;#8217;t think that there is a compilation of papers from LCG students despite many of us doing research all over the globe: Mexico, US, Canada, Denmark, France, England, Germany, Switzerland, Austria, Australia, to name a few countries. Maybe compiling a list of papers with contributions from LCG students is a task for &lt;a href="http://masciencia.org/"&gt;Más Ciencia por México&lt;/a&gt;, which seeks to promote science in Mexico. But I would be happy to learn what others are doing and in a way keep in touch academically. &lt;/p&gt;
&lt;p&gt;Another reason in favor is that blogging helps me practice my English. And writing helps me organize my ideas.&lt;/p&gt;
&lt;p&gt;But the question remains: if you systematically comment on papers, what would you comment on?&lt;/p&gt;
&lt;p&gt;I think that I should state my opinion of the paper in different areas. Kind of like doing a review. First, try to summarize the paper. Next, was the scientific objective clear? Did they answer the main question? Then, given the nature of my Ph.D. program, &lt;span&gt;I think that I should try to comment on any statistics used in the papers. This certainly includes the plots and reproducibility. If they included tools (software), I could take a quick look at them. Then, I can end by stating the main things I liked. Maybe I could come up with some scoring mechanism to rate the paper.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;You can think of other aspects of a paper to talk about. For example, in what way did it help its field? But I don&amp;#8217;t think that I can answer this for many papers outside my research area. It would all be speculation. I guess that I could use Google Scholar to see who cited the paper and maybe comment on its impact that way. For the LCG papers, I could point out how the LCG students contributed. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Or maybe I could take the more educational route. But that&amp;#8217;s very time consuming as I can see from &lt;a href="http://cienciaexplicada.blogspot.com/"&gt;La Ciencia explicada&lt;/a&gt;&amp;#8217;s highly detailed posts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Anyhow, one thing I do have clear in mind is how I would implement it. I&amp;#8217;m thinking of making a GitHub repository and writing my comments using Rmd and knitr, then posting them here using Markdown. It should then be easy to have a template post and fill in the gaps after reading the paper.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;The risk of using a template is that the comments will start to look boring. That&amp;#8217;s why I might add a more free section, or change things up a bit.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;If you have any ideas, let me know!&lt;/span&gt;&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/45152484977</link><guid>http://fellgernon.tumblr.com/post/45152484977</guid><pubDate>Mon, 11 Mar 2013 20:59:11 -0400</pubDate><category>Paper comments</category><category>LCG</category></item><item><title>Analyzing SimplyStatistics visits info</title><description>&lt;p&gt;Recently we had to analyze the data of the number of visits per day to &lt;a href="http://simplystatistics.org/"&gt;SimplyStatistics.org&lt;/a&gt;. There were two goals:&lt;/p&gt;
&lt;ol&gt;&lt;li&gt;Estimate the fraction of visitors retained after a spike in the number of visitors&lt;/li&gt;
&lt;li&gt;Identify any factors (if there are any) that influence the fraction estimated in 1.&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;For me it was a fun project in part because I like SimplyStatistics but also because I think that finding the answers to the questions would be interesting and help understand the readers of that blog.&lt;/p&gt;
&lt;p&gt;Sadly, I didn&amp;#8217;t work on it much. We had lots of stuff due that week, but well, I&amp;#8217;m happy enough with the analysis I did. My own report is hosted &lt;a href="https://github.com/lcolladotor/lcollado753/tree/master/hw/data-analysis-02"&gt;here&lt;/a&gt; and &lt;a href="https://github.com/lcolladotor/lcollado753/blob/master/hw/data-analysis-02/report/data_02_lcollado.pdf" target="_blank"&gt;this is the pdf file of the report itself&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Half joking with other students, I said that I basically did t-tests. Hopefully I can work on changing this tendency with the pile of recommended books I&amp;#8217;ve been acquiring but not really reading through. Except for the &lt;a href="http://bit.ly/13MyHwt"&gt;ggplot2: Elegant Graphics for Data Analysis&lt;/a&gt; and the &lt;a href="http://oreil.ly/Yk8xtl"&gt;R Graphics Cookbook&lt;/a&gt;. Sounds like spring break will be fun :P&lt;/p&gt;

&lt;p&gt;Kind of related to this, &lt;a href="http://bit.ly/13MypWw"&gt;Jeff Leek announced yesterday that he is going to compile a list of student blogs that have something to do with statistics and data&lt;/a&gt;. He added a link to my blog, which is why I saw a large peak in Fellgernon Bit&amp;#8217;s visitor data. After all, when doing the data analysis described above I played around with the data from Fellgernon Bit and now know that, at a minimum, posting drives visitors to sites (which sounds obvious, but maybe you get random traffic); see &lt;a href="https://github.com/lcolladotor/lcollado753/blob/master/hw/data-analysis-02/report/data_02_lcollado.pdf"&gt;fig 1 of the report&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="image" src="http://media.tumblr.com/f5ce3511fb8d6899a613e348a846dcc8/tumblr_inline_mjf4iavs4A1qz4rgp.png"/&gt;&lt;/p&gt;

&lt;p&gt;Had Jeff done so before, I could have had a point estimate (without being able to say anything about its uncertainty) that SimplyStatistics has 142 visitors that read the posts AND click on the links. Maybe using the info from &lt;a href="http://bit.ly/12vVmbp"&gt;Hilary&amp;#8217;s&lt;/a&gt; and &lt;a href="http://bit.ly/13MyyZS"&gt;Alyssa&amp;#8217;s&lt;/a&gt; blogs we could have an estimate with some measure of uncertainty, but only for March 8th.&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/44980154458</link><guid>http://fellgernon.tumblr.com/post/44980154458</guid><pubDate>Sat, 09 Mar 2013 19:19:00 -0500</pubDate><category>DataAnalysis</category><category>Blog</category><category>@simplystats</category><category>jhsph753</category></item><item><title>Alfred: a must for any Mac user</title><description>&lt;p&gt;At the beginning of the semester, I decided to go hunting for Mac apps that would help me be more organized and/or enjoy my Mac even more. After all, I was using the basics (with multiple spaces) and had only customized my favorite editors. &lt;/p&gt;
&lt;p&gt;It turns out that &lt;a href="http://bit.ly/12w0M6m"&gt;Alfred&lt;/a&gt; is an excellent app. The free version can get you a lot of mileage and save you lots of time by typing alt + space, then entering the keyword you want to search in Google. Or alt + space, image, then the query for Google Images. Or alt + space, find, the parts of the name of a file you want. Or alt + space, in, something you want to find inside a file. I love the alt + space, define, something I want to find in the dictionary. I know that dictionaries are just around the corner [a bookmark away!] but still, thanks to Alfred I now look up words WAY more frequently than I did before. I mean, being able to do a ton of stuff without moving my hands away from the keyboard is just great =)&lt;/p&gt;
&lt;p&gt;There are plenty of other default searches that come with Alfred&amp;#8217;s free version. Another thing that I love is using it to do system commands like lock the screen, or send my computer to sleep.&lt;/p&gt;
&lt;p&gt;Try it out!&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/43153620793</link><guid>http://fellgernon.tumblr.com/post/43153620793</guid><pubDate>Fri, 15 Feb 2013 11:30:29 -0500</pubDate><category>Mac</category></item><item><title>Second cultural mixer today!</title><description>&lt;img src="http://25.media.tumblr.com/f55efedc3ca5189bcc1600941f7ef56c/tumblr_mi58wrDjuB1qgn8kjo1_500.png"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Second cultural mixer today!&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/43149299787</link><guid>http://fellgernon.tumblr.com/post/43149299787</guid><pubDate>Fri, 15 Feb 2013 10:00:19 -0500</pubDate><category>Biostatistics</category><category>Social</category><category>Cultural</category></item><item><title>Liking "Inbox Zero for Life"</title><description>&lt;p&gt;I&amp;#8217;ve been using the &amp;#8220;&lt;a href="http://bit.ly/12vYvIh"&gt;Inbox Zero for Life&lt;/a&gt;&amp;#8221; strategy for a few weeks, and I think that it&amp;#8217;s paid off for me in this short span.&lt;/p&gt;
&lt;p&gt;As stated in that long guide, one of the major concerns you might have is that you could end up just trading one problem for another. I think that so far, that hasn&amp;#8217;t been the case for me. Sure, my number of starred emails is not 0, but it stays steady and doesn&amp;#8217;t keep increasing as my inbox (even with priority inbox) did. &lt;/p&gt;
&lt;p&gt;I also have a few filters in place that pick up the emails that I will most likely never read. For example, advertising emails and school wide announcements.&lt;/p&gt;
&lt;p&gt;One point I&amp;#8217;m not sure that I buy is the whole psychological effect of having an empty inbox. But whether or not that&amp;#8217;s true, I certainly didn&amp;#8217;t know that the gmail app in the iPhone/iPad has a weird smiley that looks like a sun telling you something like: &amp;#8220;Your inbox is empty. Have a nice day!&amp;#8221;&lt;/p&gt;
&lt;p&gt;Credit goes to &lt;a href="http://bit.ly/12vVmbp"&gt;Hilary&lt;/a&gt; for finding and telling us about this strategy.&lt;/p&gt;</description><link>http://fellgernon.tumblr.com/post/43081499214</link><guid>http://fellgernon.tumblr.com/post/43081499214</guid><pubDate>Thu, 14 Feb 2013 11:30:36 -0500</pubDate><category>Gmail</category></item></channel></rss>
