Showing posts with label maps. Show all posts

Monday, March 30, 2015

Simple Maps with ggplot2

As part of a case competition I recently participated in, our team was struggling to put together a convincing argument for replacing NIH review sessions (a peer-review system for dispensing NIH research funds) with randomly assigned grants. With only hours left before the deadline, we needed a quick way to display geographical data demonstrating uneven (*cough* biased) grant distribution under the current system. I won't make a case for that idea here (we did not win the case competition :), but along the way I came across a simple method in R that uses the ggplot2 and maps packages. I was able to go from zero to map in under thirty minutes. I was impressed, and you may find this useful the next time you try to stick it to the man.

The R script and data can be downloaded from our Bitbucket repository.

Step one, read in your data:

I pulled some data on recent NIH grants (all sizes), state populations, and the number of universities per state. I compiled this information into a tab-delimited file, "nih_funding.txt", with two added columns: the amount of NIH funding per person in each state and the amount of NIH funding per university in each state.
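
For anyone building a similar file from scratch, here is a minimal sketch of how the two derived columns could be computed in base R. The numbers are made up and the raw column names (NIH.Funding, Population, Institutions) are my own placeholders, not necessarily what is in the real file; only LOCATION and the two "per" columns appear in the script below.

```r
# Toy rows with made-up numbers; the raw column names are hypothetical.
raw <- data.frame(
  LOCATION     = c("Alpha", "Beta"),   # placeholder names, not real states
  NIH.Funding  = c(1000000, 250000),   # total dollars (made up)
  Population   = c(500000, 100000),    # residents (made up)
  Institutions = c(4, 1),              # universities (made up)
  stringsAsFactors = FALSE
)

# The two derived columns that the plotting code uses as fill values.
raw$NIH.Funding.per.person      <- raw$NIH.Funding / raw$Population
raw$NIH.Funding.per.institution <- raw$NIH.Funding / raw$Institutions

# Write a tab-separated file that read.table(..., header = TRUE, sep = "\t")
# can read back. A temp file is used here so nothing real gets overwritten.
out_file <- tempfile(fileext = ".txt")
write.table(raw, out_file, sep = "\t", row.names = FALSE, quote = FALSE)
```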

 
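 library(ggplot2)  # map_data() below also needs the maps package installed
 # Read the tab-delimited funding table
 nih_data <- read.table('nih_funding.txt', header=TRUE, sep='\t')
 # Lower-case the state names so they match the region names in the map data
 nih_data$LOCATION <- tolower(nih_data$LOCATION)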
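
The lower-casing matters because the map data identifies each state by its lower-case full name, and a row whose name does not match simply will not show up on the map. Here is a small sketch of a sanity check. The helper name unmatched_locations is my own, and I use base R's built-in state.name vector as a stand-in for the map's region names so the snippet runs without any packages; in the real script you would pass unique(map_data("state")$region) instead.

```r
# Return the location names that have no matching region in the map data.
# (Hypothetical helper, not part of ggplot2 or maps.)
unmatched_locations <- function(locations, map_regions) {
  setdiff(tolower(trimws(locations)), map_regions)
}

# Stand-in for unique(map_data("state")$region): the 50 state names that
# ship with base R, lower-cased. The real map covers the lower 48 plus
# "district of columbia", so treat this as an approximation.
map_regions <- tolower(state.name)

# Stray capitals and trailing spaces are handled; a nonstandard name is flagged.
unmatched_locations(c("Ohio", "New York ", "Washington DC"), map_regions)
# → "washington dc"
```

An empty result means every row should find its state on the map.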


Step two, plot your data:

First, plot NIH funding per university:

 
 # NIH.Funding.per.institution
 states_map <- map_data("state")  # state outlines (lower 48) from the maps package
 m <- ggplot(nih_data, aes(map_id = LOCATION)) +
   geom_map(aes(fill = NIH.Funding.per.institution), map = states_map) +
   # geom_map() does not set x/y limits, so take them from the map itself
   expand_limits(x = states_map$long, y = states_map$lat) +
   theme_bw() +
   theme(axis.title = element_blank(), axis.text = element_blank()) +
   ggtitle("NIH Funding per Institution by State")
 print(m)
 ggsave("NIH_funding_by_institution.jpg", plot = m, width = 8, height = 4)

Because we're using ggplot2, the image is constructed layer by layer. First a ggplot object is created, then a "geom_map()" layer is added. In this case, the map is a map of the United States (the "state" map that ships with the maps package). "expand_limits()" stretches the axes to cover the whole map. The "theme_bw()" function removes the gray background. The "theme()" function removes the axis titles and tick labels. "ggtitle()"--this may come as a surprise--adds a title to your image.

Run this, and your map should come out looking like this:



Then plot NIH funding per person:

 
 # NIH.Funding.per.person
 states_map <- map_data("state")  # same map data as before
 m <- ggplot(nih_data, aes(map_id = LOCATION)) +
   geom_map(aes(fill = NIH.Funding.per.person), map = states_map) +
   expand_limits(x = states_map$long, y = states_map$lat) +
   theme_bw() +
   theme(axis.title = element_blank(), axis.text = element_blank()) +
   ggtitle("NIH Funding per Person by State")
 print(m)
 ggsave("NIH_funding_by_population.jpg", plot = m, width = 8, height = 4)

Notice how some states seem to receive greater-than-average federal research money for their population size and number of universities. This is not an in-depth analysis, so there may be good reasons for this apparent discrepancy. The real takeaway here is that in just two steps you find yourself staring at a beautiful map. Not bad for a day's work.
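
Since the two plotting blocks in this post differ only in the fill column and the title, they could be folded into one small function. This is just a sketch, not what the downloadable script does: the name plot_state_map and its arguments are my own invention, and it assumes the data frame has a LOCATION column of lower-case state names as above.

```r
library(ggplot2)  # the default map_data("state") also needs the maps package

# Hypothetical helper: draw one state-level map for a chosen column.
#   data       data frame with a LOCATION column (lower-case state names)
#   fill_col   name of the numeric column to use for the fill, as a string
#   title      plot title
#   states_map map polygons; defaults to the "state" map from maps
plot_state_map <- function(data, fill_col, title,
                           states_map = map_data("state")) {
  ggplot(data, aes(map_id = LOCATION)) +
    # .data[[fill_col]] looks the column up by its name at plot time
    geom_map(aes(fill = .data[[fill_col]]), map = states_map) +
    expand_limits(x = states_map$long, y = states_map$lat) +
    theme_bw() +
    theme(axis.title = element_blank(), axis.text = element_blank()) +
    ggtitle(title)
}

# Usage, assuming nih_data has been read in as in step one:
# m <- plot_state_map(nih_data, "NIH.Funding.per.person",
#                     "NIH Funding per Person by State")
# ggsave("NIH_funding_by_population.jpg", plot = m, width = 8, height = 4)
```

Returning the plot object rather than saving inside the function keeps the helper flexible: you can print it, add more layers, or hand it to ggsave().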

Monday, March 16, 2015

Getting on the Map

Open source mapping software makes it much easier to produce beautiful, custom maps. Nature recently published an article on custom mapping that reviews many mapping resources, from specialized tools such as MapBox and the Google Maps API to general-purpose scripting languages with mapping capability, such as R and Python.