Mapping with ggplot2

Objective

There are many different things that require scientists to use programming languages (like R). Far too many to count here. There is however one common use amongst almost all environmental scientists: mapping. Almost every report, research project or paper will have need to refer to a study area. This is almost always “Figure 1”. To this end, whenever I teach R, or run workshops on it, one of the questions I am always prepared for is how to create a map of a particular area. Being a happy convert to the tidyverse I only teach the graphics of ggplot2. I have found that people often prefer to use the ggmap extension to create ggplot quality figures with Google map backgrounds, but I personally think that a more traditional monotone background for maps looks more professional. What I’ve decided to showcase this week is the data and code required to create a publication quality map. Indeed, the following code will create the aforementioned obligatory “Figure 1” in a paper I am currently preparing for submission.

Data

There are heaps of packages etc. that one may use to create maps. And there is a never ending source of blogs, books and tutorials that illustrate many of the different ways to visualise spatial data. For my international and geographic borders I prefer to use data I’ve downloaded from GSHHSG and then converted to dataframes using functions found in the PBSmapping package. I then save these converted dataframes as .Rdata objects on my computer for ease of use with all of my projects. For the domestic borders of a country, which I won’t use in this post, one may go here. Note however that for some strange reason this website still has the pre-1994 borders for South Africa. For the correct SA borders one must go here. The current SA borders may actually be download in the .Rdata format, which is neat.

Once one has the borders to be used in the map, the next step is to think about what one actually wants to show. The main purpose of this map is to show where several in situ coastal seawater temperature time series were collected. This could be done quite simply but a plain black and white map is offensively boring so we want to make sure there is a good amount of (but not too much!) colour in order to entice the reader. I personally find pictures of meso-scale oceanic phenomena particularly beautiful so try to include them whenever I can. Luckily that is also what I study so it is not strange that I include such things in my work. Now if only I studied panda’s, too…

Panda’s aside, the current work I am engaged in also requires that the atmospheric processes around southern Africa be considered in addition to the oceanography. To visualise both air and sea concurrently would be a mess so we will want to create separate panels for each. Because I have been working with reanalysis data lately, and not satellite data, I am also able to include the wind/ current vectors in order to really help the temperature patterns pop. The oceanic data are from the BRAN2016 product and the atmospheric data are from ERA-Interim. Both of which are available for download for free for scientific pursuits. I’ve chosen here to use the mean values for January 1st as the summer months provide the most clear example of the thermal differences between the Agulhas and Benguela currents. The code used to create the scale bar in the maps may be found here. It’s not a proper ggplot geom function but works well enough. I’ve also decided to add the 200 m isobath to the sea panel. These data come from NOAA.

## Libraries
library(tidyverse)
library(viridis)
library(gridExtra)

## Data
# South Africa map data
load("../../static/data/southern_africa_coast.Rdata") # Lowres
names(southern_africa_coast)[1] <- "lon"
load("../../static/data/sa_shore.Rdata") # Hires
names(sa_shore)[4:5] <- c("lon","lat")

# International borders
load("../../static/data/africa_borders.Rdata")

# Reanalysis data
load("../../static/data/all_jan1_0.5.Rdata")
names(all_jan1_0.5)[1:2] <- c("lon","lat")

# In situ time series locations
site_list <- read_csv("../../static/data/mg_site_list.csv")
site_list$order <- 1:nrow(site_list)

# Bathymetry data
load("../../static/data/sa_bathy.Rdata")

## Scale bar function
source("../../static/func/scale.bar.func.R")

Mapping

I find that it is easier to keep track of the different aspects of a map when they are stored as different dataframes. One should however avoid having too many loose dataframes running about in the global environment. It is a balancing act and requires one to find a happy middle ground. Here I am going to cut the all_jan1_0.5 dataframe into 4. One each for air and sea temperatures and vectors. I am also going to reduce the resolution of the wind so that the vectors will plot more nicely.

# Devide the reanalysis data
sea_temp <- filter(all_jan1_0.5, variable == "BRAN/temp")
air_temp <- filter(all_jan1_0.5, variable == "ERA/temp")
currents <- filter(all_jan1_0.5, variable == "BRAN/u" | variable == "BRAN/v") %>% 
  select(-date, -index) %>% 
  spread(key = variable, value = value) %>% 
  rename(u = "BRAN/u", v = "BRAN/v")
winds <- filter(all_jan1_0.5, variable == "ERA/u" | variable == "ERA/v") %>% 
  select(-date, -index) %>% 
  spread(key = variable, value = value) %>% 
  rename(u = "ERA/u", v = "ERA/v")

# Reduce wind/ current vectors
lon_sub <- seq(10, 40, by = 1)
lat_sub <- seq(-40, -15, by = 1)
# currents <- currents[(currents$lon %in% lon_sub & currents$lat %in% lat_sub),]
winds <- winds[(winds$lon %in% lon_sub & winds$lat %in% lat_sub),]

With just a few alterations to our nicely divided up dataframes we are ready to create a map. We will look at the code required to create each map and then put it all together in the end.

First up is the most busy. The following code chunk will create the top panel of our map, the sea state. It is necessary to label all of the locations mentioned in the text and so they are thrown on here. In order to make the site label easier to read I’ve made them red. This is particularly jarring but I think I like it.

# Establish the vector scalar for the currents
current_uv_scalar <- 2

# The top figure (sea)
mg_top <- ggplot(data = southern_africa_coast, aes(x = lon, y = lat)) +
  # The ocean temperature
    geom_raster(data = sea_temp, aes(fill = value)) +
  # The bathymetry
    stat_contour(data = sa_bathy[sa_bathy$depth < -200 & sa_bathy$depth > -2000,], 
                 aes(x = lon, y = lat, z = depth, alpha = ..level..),
                 colour = "ivory", size = 0.5, binwidth = 1000, na.rm = TRUE, show.legend = FALSE) +
  # The current vectors
    geom_segment(data = currents, aes(xend = lon + u * current_uv_scalar, yend = lat + v * current_uv_scalar),
                 arrow = arrow(angle = 15, length = unit(0.02, "inches"), type = "closed"), alpha = 0.4) +
  # The land mass
    geom_polygon(aes(group = group), fill = "grey70", colour = "black", size = 0.5, show.legend = FALSE) +
    geom_path(data = africa_borders, aes(group = group)) +
  # The legend for the vector length
    geom_label(aes(x = 36, y = -37, label = "1.0 m/s\n"), size = 3, label.padding = unit(0.5, "lines")) +
    geom_segment(aes(x = 35, y = -37.5, xend = 37, yend = -37.5)) +
  # The in situ sites
    geom_point(data = site_list, shape = 1,  size = 2.8, colour = "ivory") +
    geom_text(data = site_list, aes(label = order), size = 1.9, colour = "red") +
  # Oceans
    annotate("text", label = "INDIAN\nOCEAN", x = 37.00, y = -34.0, size = 4.0, angle = 0, colour = "ivory") +
    annotate("text", label = "ATLANTIC\nOCEAN", x = 13.10, y = -34.0, size = 4.0, angle = 0, colour = "ivory") +
  # Benguela
    geom_segment(aes(x = 17.2, y = -32.6, xend = 15.2, yend = -29.5),
                arrow = arrow(length = unit(0.3, "cm")), size = 0.5, colour = "ivory") +
    annotate("text", label = "Benguela", x = 16.0, y = -31.8, size = 3.5, angle = 298, colour = "ivory") +
  # Agulhas
    geom_segment(aes(x = 33, y = -29.5, xend = 29.8, yend = -33.0),
                arrow = arrow(length = unit(0.3, "cm")), size = 0.5, colour = "ivory") +
    annotate("text", label = "Agulhas", x = 31.7, y = -31.7, size = 3.5, angle = 53, colour = "ivory") +
  # Agulhas Bank
    annotate("text", label = "Agulhas\nBank", x = 22.5, y = -35.5, size = 3.0, angle = 0, colour = "ivory") +
  # Cape Peninsula
    annotate("text", label = "Cape\nPeninsula", x = 17.2, y = -35, size = 3.0, angle = 0, colour = "ivory") +
  # Improve on the x and y axis labels
    scale_x_continuous(breaks = seq(15, 35, 5),
                       labels = scales::unit_format(prefix = "°E", sep = ""),
                       position = "top") +
    scale_y_continuous(breaks = seq(-35, -30, 5),
                       labels = c("35°S", "30°S")) +
    labs(x = NULL, y = NULL) +
  # Slightly shrink the plotting area
    coord_cartesian(xlim = c(10.5, 39.5), ylim = c(-39.5, -25.5), expand = F) +
  # Use viridis colour scheme
    scale_fill_viridis(name = "Temp.\n(°C)", option = "D") +
  # Adjust the theme
    theme_bw() +
    theme(panel.border = element_rect(fill = NA, colour = "black", size = 1),
          axis.text = element_text(colour = "black"),
          axis.ticks = element_line(colour = "black"))

Many of the sites that need to be plotted are laying on top of each other. This is never good, but is made worse when the sites in question are refereed to frequently in the text. For this reason we need to create a little panel inside of the larger figure that shows a zoomed in picture of False Bay. Complete with text labels.

# False Bay inset
fb <- ggplot(data = sa_shore, aes(x = lon, y = lat)) +
  # The land mass
    geom_polygon(aes(group = PID),
                fill = "grey70", colour = NA, size = 0.5, show.legend = FALSE) +
  # The in situ sites
    geom_point(data = site_list, shape = 1,  size = 3, colour = "black") +
    geom_text(data = site_list, aes(label = order), size = 2.3, colour = "red") +
  # Text label
    geom_text(aes(x = 18.65, y = -34.25, label = "False\nBay"), size = 2.7) +
  # Control the x and y axes
    coord_cartesian(xlim = c(18.2, 19), ylim = c(-34.5, -33.8), expand = F) +
    scale_x_continuous(breaks = c(18.5), label = "18.5°E") +
    scale_y_continuous(breaks = c(-34.1), label = "34.1°S") +
    labs(x = NULL, y = NULL) +
  # Change the theme for cleaner over-plotting
    theme_bw() +
    theme(plot.background = element_blank(),
          axis.text = element_text(colour = "ivory"),
          axis.text.y = element_text(angle = 90, hjust = 0.5),
          axis.ticks = element_line(colour = "ivory"),
          panel.border = element_rect(colour = "ivory"),
          panel.grid = element_blank())

We could possibly create another inset panel for the clomp of sites around Hamburg but this figure is already getting too busy. So we’ll leave it for now. One inset panel will serve to illustrate the code necessary to create a faceted map so for the purposes of this post it will also suffice. That leaves us with only the bottom panel to create. The air state. I’ve decided to put the scale bar/ North arrow on this panel in an attempt to balance the amount of information in each panel.

# Establish the vector scalar for the wind
wind_uv_scalar <- 0.5

# The bottom figure (air)
mg_bottom <- ggplot(data = southern_africa_coast, aes(x = lon, y = lat)) +
  # The ocean temperature
    geom_raster(data = air_temp, aes(fill = value)) +
  # The land mass
    geom_polygon(aes(group = group), fill = NA, colour = "black", size = 0.5, show.legend = FALSE) +
    geom_path(data = africa_borders, aes(group = group)) +
  # The current vectors
    geom_segment(data = winds, aes(xend = lon + u * wind_uv_scalar, yend = lat + v * wind_uv_scalar),
                 arrow = arrow(angle = 15, length = unit(0.02, "inches"), type = "closed"), alpha = 0.4) +
  # The legend for the vector length
    geom_label(aes(x = 36, y = -37, label = "4.0 m/s\n"), size = 3, label.padding = unit(0.5, "lines")) +
    geom_segment(aes(x = 35, y = -37.5, xend = 37, yend = -37.5)) +
  # Improve on the x and y axis labels
    scale_x_continuous(breaks = seq(15, 35, 5),
                       labels = scales::unit_format(prefix = "°E", sep = "")) +
    scale_y_continuous(breaks = seq(-35, -30, 5),
                       labels = c("35°S", "30°S")) +
    labs(x = NULL, y = NULL) +
  # Scale bar
    scaleBar(lon = 13, lat = -38.0, distanceLon = 200, distanceLat = 50, distanceLegend = 90, dist.unit = "km",
             arrow.length = 200, arrow.distance = 130, arrow.North.size = 4) +
  # Slightly shrink the plotting area
    coord_cartesian(xlim = c(10.5, 39.5), ylim = c(-39.5, -25.5), expand = F) +
  # Use viridis colour scheme
    scale_fill_viridis(name = "Temp.\n(°C)", option = "A") +
  # Adjust the theme
    theme_bw() +
    theme(panel.border = element_rect(fill = NA, colour = "black", size = 1),
          axis.text = element_text(colour = "black"),
          axis.ticks = element_line(colour = "black"))

With our three pieces of the map complete, it is time to stick them together. There are many ways to do this but I have recently found that using annotation_custom allows one to stick any sort of ggplot like object onto any other sort of ggplot object. This is an exciting development and opens up a lot of doors for some pretty creative stuff. Here I will just use it to demonstrate simple faceting, but combined with panel gridding. Really though the sky is the limit.

# Convert the figures to grobs
mg_top_grob <- ggplotGrob(mg_top)
fb_grob <- ggplotGrob(fb)
mg_bottom_grob <- ggplotGrob(mg_bottom)

# Stick them together
gg <- ggplot() +
  # First set the x and y axis values so we know what the ranges are
  # in order to make it easier to place our facets
    coord_equal(xlim = c(1, 10), ylim = c(1, 10), expand = F) +
  # Then we place our facetsover one another using the coordinates we created
    annotation_custom(mg_top_grob,
                      xmin = 1, xmax = 10, ymin = 5.5, ymax = 10) +
    annotation_custom(fb_grob,
                      xmin = 3.5, xmax = 5.5, ymin = 7.2, ymax = 8.8) +
    annotation_custom(mg_bottom_grob,
                      xmin = 1, xmax = 10, ymin = 1, ymax = 5.5)

Summary

The developments in the gridding system have brought the potential for using ggplot for these more complex maps forward quite a bit. As long as one does not use a constrained mapping coordinate system (i.e. coord_fixed) the grob-ification of the ggplot objects seems to allow the placing of the pieces into a common area to be performed smoothly. Displaying many different bits of information cleanly is always a challenge. This figure is particularly busy, out of necessity. I think it turned out very nicely though.

Figure 1: Map showing the southern tip of the African continent. The top panel shows the typical sea surface temperature and surface currents on January 1st. The bottom panel likewise shows the typical surface air temperatures and winds on any given January 1st.

Related