Data @ Reed

Why Use R? 

R is difficult to learn at first, as any language is, so you might wonder why you should learn it instead of using Excel or Google Sheets. But R can do far more than those programs: learning it lets you run a broader range of analyses, make your graphs look prettier, and save time in the long run. The learning curve is steep, but it is well worth it once you realize you can rerun an analysis on a new set of data and produce a beautiful graph in less than five seconds.

Here are some of R’s main advantages:

Being open source promotes community.

R is open-source software, which means it is freely available for everyone to use and modify. R and Posit (the company behind RStudio, a widely used editor for R) have a vibrant and supportive community, so there is bountiful knowledge on the web about how to use R. If you have a problem, chances are someone else has had it too, and you can easily Google a solution. The open-source nature also means that if you love R, you can write your own packages for it and become a contributor!

It is powerful, flexible, and highly customizable to any specific field.

R is designed with statistical analysis in mind, making it one of the most powerful tools for data science. Its extensive library of packages covers nearly every statistical technique, from simple regressions to sophisticated machine learning algorithms. But unlike more rigid software like Excel or Google Sheets, R enables you to tailor your analysis to meet specific needs. Because anyone can write packages for R, there are packages that perform highly specialized tasks. Are you an evolutionary biologist who needs to make a phylogenetic tree? R can do that! Are you an economist who needs to pull data from the 1980 census for the state of Oregon separated by county? R can do that directly, no need to download spreadsheets! Are you an epidemiologist who wants to model and then map disease outbreaks? There’s a package for that! 

Code is reusable and runs quickly, which saves you time.

With coding, you often don’t have to start from a blank page. It’s not like writing an essay, where you start from scratch each time. You start from example code that does something similar to what you want, then you build new scripts off your old scripts. If you want to run the same analysis on a new set of data, all you have to do is change the file name and press “run,” and your entire analysis is done. Unlike a point-and-click program like Excel, R lets you automate complex workflows in a script, which makes them fast and reduces the potential for human error. And sometimes big tasks in other programs take only a single line of code in R, which you’ll love if you’ve been running ANOVAs in JMP. Having all your steps written out in a script also allows for reproducible research: you can share your code and results with others, and they can follow exactly what you did without missing any steps.
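To make the "change the file name and rerun" idea concrete, here is a minimal sketch. The function name run_analysis and the column names weight and group are assumptions chosen for illustration, and R's built-in PlantGrowth dataset stands in for your own spreadsheet.

```r
# Hypothetical helper: reruns the same one-way ANOVA on any CSV file
# that has a numeric `weight` column and a `group` column.
run_analysis <- function(path) {
  data <- read.csv(path)
  data$group <- factor(data$group)          # treat group labels as categories
  fit <- aov(weight ~ group, data = data)   # the ANOVA itself is one line
  summary(fit)
}

# Stand-in for "your data": write the built-in PlantGrowth dataset
# to a temporary CSV so this sketch runs on its own.
csv_path <- tempfile(fileext = ".csv")
write.csv(PlantGrowth, csv_path, row.names = FALSE)

# To analyze a different dataset, only this path needs to change.
result <- run_analysis(csv_path)
print(result)
```

Because every step lives in the script, anyone you share it with can rerun it on the same file and get the same table.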

It makes prettier graphs. You can get everything to look exactly how you want.

R is not just about numbers and code; it's also about creativity. R gives you full control over the appearance of your visualizations, from the layout and colors to the fonts and styles. There are many packages devoted just to getting your color palettes to be precisely the right gradient or to making the tick marks on your axes exactly the height and width you would like. If you are someone who spends as much time picking the colors for your graph as you do making the graph itself, you will love the level of customization you can do in R. To get you started, here’s a nice gallery of R plots and a package someone created that turns Wes Anderson movies into color palettes (scroll down on site to see).
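As a small taste of that control, the sketch below customizes colors, font, and tick marks using only the graphics functions that ship with R. It does not use the gallery or palette package mentioned above; the hex colors, titles, and output file are illustrative assumptions, and PlantGrowth is a built-in example dataset.

```r
# Illustrative colors, one per group; swap in any hex codes you like.
box_colors <- c("#1B9E77", "#D95F02", "#7570B3")

# Draw into a temporary PDF so the sketch runs without a screen.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 4)

par(
  family = "serif",  # font family for all text in the plot
  tcl = -0.3,        # tick mark length (negative values point outward)
  las = 1            # keep axis labels horizontal
)

boxplot(
  weight ~ group,
  data = PlantGrowth,
  col = box_colors,     # fill color of each box
  border = "gray30",    # outline color
  main = "Plant weight by treatment",
  xlab = "Group",
  ylab = "Dried weight"
)

dev.off()  # close the file so the plot is written to disk
```

Each argument changes one visual detail, so you can adjust a single line and redraw until the graph looks the way you want.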