Benchmarking R and Rcpp

A speed comparison.

November 10, 2019 Jason Mercer

7 minutes read

At the UW Data Science Center we’ve been talking speed lately. Generally R is considered a somewhat slow language (though I’m not sure that’s fair). However, there are some excellent packages and interfaces that can drastically speed up R. For example, the Rcpp package acts as an interface between R and C++, thereby allowing useRs to drastically speed-up analysis – or at least that’s the idea.

In this post I explore the relative speeds of equivalent R and Rcpp functions to see if we really can speed up an arbitrary function by using C++ on the back end.

Libraries

library(tidyverse)

## Warning: package 'tidyverse' was built under R version 3.6.2

## Warning: package 'tidyr' was built under R version 3.6.2

## Warning: package 'dplyr' was built under R version 3.6.2

library(Rcpp)
options(tibble.width = Inf)

R equations

First, I’m going to make some R equations. We don’t really care what these functions are for the purposes of this exercise. But it one is curious, these equations are very, very loosely based on specific yield relationships, as a function of water table depth and microtopography. The main function being test is: full_balance.

sy <- function(x, cond) {
  out <- ifelse(cond < x, 1, 0)
  out <- 1 - (sum(out) / length(x))
  ifelse(out <= 0.01, 0.01, out)
}

new_cond <- function(curr, addit, subtrac, microt) {
  sy_cond <- sy(microt, curr)
  curr + (addit + subtrac) / sy_cond
}

full_balance <- function(iterations, initial, addition, subtraction, add_sd,
  sub_sd, microtopo) {
  #Initialize
  N <- length(addition) #rows
  M <- iterations #columns
  out <- matrix(nrow = N, ncol = M)
  out[1,] <- initial
  add_error <- matrix(rep(addition, times = M) + rnorm(N * M, 0, add_sd),
    nrow = N, ncol = M, byrow = FALSE)
  sub_error <- matrix(rep(subtraction, times = M) + rnorm(N * M, 0, sub_sd),
    nrow = N, ncol = M, byrow = FALSE)
  #Execute
  for(i in 1:iterations) { #columns
    for(j in 2:length(addition)) { #rows
      out[j, i] <- new_cond(out[j - 1, i], add_error[j, i], sub_error[j, i],
        microtopo)
    }
  }
  #Return
  as.data.frame(out)
}

C++ (Rcpp) equations

Similar functions to those above, but implemented in C++ via the Rcpp package. I should note that I’m not a C++ programmer, so I tried my best to match the functions of the two languages. The main function being test is: full_balance_cpp.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double sy_cpp(NumericVector x, double cond) {
  //Declare
  int N = x.length();
  NumericVector cond_sum(N);
  double out;
  //First ifelse()
  // Short-hand form:
  for(int i = 0; i < N; i++) {
    cond_sum[i] = (cond < x[i] ? 1 : 0);
  }
  //`out` line
  out = 1 - (sum(cond_sum) / N);
  //Second ifelse()
  if(out <= 0.01) {
    out = 0.01;
  }
  //Explicit return statement
  return out;
}

// [[Rcpp::export]]
double new_cond_cpp(double curr, double addit, double subtrac,
  NumericVector microt) {
  //Declare
  double sy_cond;
  double out;
  //Execute
  sy_cond = sy_cpp(microt, curr);
  out = curr + (addit + subtrac) / sy_cond;
  //Return
  return out;
}

// [[Rcpp::export]]
DataFrame full_balance_cpp(int iterations, double initial,
  NumericVector addition, NumericVector subtraction, double add_sd,
  double sub_sd, NumericVector microtopo) {
  //Declare
  int N = addition.length(); //Rows
  int M = iterations; //Columns
  NumericMatrix add_error(N, M);
  NumericMatrix sub_error(N, M);
  NumericMatrix out_mat(N, M);
  
  //Initialize
  //Set initial conditions for each iteration
  for(int i = 0; i < M; ++i) {
    out_mat(0,i) = initial; //Note `out()` instead of `out[]` for indexing matrices
  }
  
  //Execute
  //Matrix of observations with error
  for(int i = 0; i < M; ++i) {
    add_error(_, i) = addition + rnorm(N, 0 , add_sd);
    sub_error(_, i) = subtraction + rnorm(N, 0 , add_sd);
  }
  //Loop through each set of observations
  for(int i = 0; i < M; ++i) {
    for(int j = 1; j < N; ++j){
      out_mat(j, i) = new_cond_cpp(out_mat(j - 1, i), add_error(j, i),
        sub_error(j, i), microtopo);
    }
  }
  
  //Return a dataframe of results
  //Explicit coercion to a different, compatable type following the pattern:
  // as<type>(object)
  DataFrame out = as<DataFrame>(out_mat);
  return out;
}

Data

Now let’s generate some data to use with our functions. Again, we don’t really care what the data actually means – it’s just an example.

per_obs <- 1:200
per_n <- length(per_obs)

add_means <- 0.5 * (sin(seq(0, 2 * per_n / 100 * pi, length.out = per_n))) +
  abs(rnorm(per_obs))
add_means <- ifelse(add_means < 0 , 0, add_means)
sub_means <- 0.75 * (cos(seq(0, 2 * per_n / 100 * pi, length.out = per_n)) - 2) +
  abs(rnorm(per_obs))
sub_means <- ifelse(sub_means > 0 , 0, sub_means)

add_sds <- 1
sub_sds <- 1

init_cond <- 0

micro <- map2(.x = c(-4, -2, 0, 2, 5), .y = c(3, 1, 2, 4, 1),
  ~ rnorm(2000, .x, .y)) %>%
  simplify()

But what does the data look like (just for fun)?

qplot(per_obs, add_means + sub_means) + theme_bw()

qplot(micro, geom = "density") + theme_bw()

Benchmarking

Here I’m using the bench package to iterate over different parameter sets, so we may get an idea of how the two equivalent functions respond to changes in function conditions. Pretty cool package.

Run the benchmarks

micro_maker <- function(obs) {
  map2(.x = c(-4, -2, 0, 2, 5), .y = c(3, 1, 2, 4, 1),
  ~ rnorm(obs, .x, .y)) %>%
  simplify()
}

iter_maker <- function(iters) {
  iters
}

bench_press <- bench::press(
  obs = c(10, 100, 1000),
  iters = c(1, 10, 100, 1000),
  {
    iterations = iter_maker(iters)
    microtopo = micro_maker(obs)
    bench::mark(
      R = full_balance(iterations, init_cond, add_means, sub_means, add_sds,
        sub_sds, microtopo),
      Rcpp = full_balance_cpp(iterations, init_cond, add_means, sub_means,
        add_sds, sub_sds, microtopo),
      check = FALSE,
      max_iterations = 20
    )
  }
)

Investigate the results of the benchmark

Now that we have the results, let’s take a quick look at the output.

bench_press %>%
  mutate(expression = attr(expression, "description")) %>%
  select(-`gc/sec`, -n_itr, -n_gc, -total_time) %>%
  print(n = nrow(.))

## # A tibble: 24 x 7
##    expression   obs iters       min    median `itr/sec`   mem_alloc
##    <chr>      <dbl> <dbl>     <dbl>     <dbl>     <dbl>       <dbl>
##  1 R             10     1  0.00190   0.00258   321.          770160
##  2 Rcpp          10     1  0.000187  0.000341 2311.          104088
##  3 R            100     1  0.00318   0.00377   234.         4627848
##  4 Rcpp         100     1  0.000361  0.000445 2112.          820488
##  5 R           1000     1  0.0236    0.0247     38.0       45122144
##  6 Rcpp        1000     1  0.00287   0.00319   208.         7984488
##  7 R             10    10  0.0173    0.0206     46.7        5913768
##  8 Rcpp          10    10  0.000719  0.000818 1092.         1016184
##  9 R            100    10  0.0410    0.0477     19.7       46701968
## 10 Rcpp         100    10  0.00335   0.00383   186.         8180184
## 11 R           1000    10  0.281     0.314       3.19     454901000
## 12 Rcpp        1000    10  0.0388    0.0446     22.5       79820184
## 13 R             10   100  0.197     0.200       4.78      58746024
## 14 Rcpp          10   100  0.00707   0.00755    89.3       10139688
## 15 R            100   100  0.611     0.611       1.64     462609712
## 16 Rcpp         100   100  0.0562    0.0687     14.4       81779688
## 17 R           1000   100  2.24      2.24        0.447   4594637656
## 18 Rcpp        1000   100  0.426     0.457       2.19     798179688
## 19 R             10  1000  2.23      2.23        0.448    588214760
## 20 Rcpp          10  1000  0.0655    0.0764      9.86     101370888
## 21 R            100  1000  4.29      4.29        0.233   4627849424
## 22 Rcpp         100  1000  0.387     0.394       2.54     817770888
## 23 R           1000  1000 23.3      23.3         0.0428 45325385200
## 24 Rcpp        1000  1000  3.77      3.77        0.265   7981770888

Interesting. Looks like Rcpp is, indeed, faster. Also interesting is the mem_alloc column. It seems that Rcpp uses much less memory.

Visualize the results of the benchmark

Finally, let’s take a look at a visual representation, comparing the two code sources across the different parameter sets. Points indicate the median iteration time. Note: variance in run times are so small that they cannot be seen at the scales provided.

bp_time_df <- bench_press %>%
  mutate(rows = 1:n(),
    expression = attr(expression, "description"))

bp_time_df %>%
  group_by(obs, iters, expression) %>%
  summarise(median = first(median)) %>%
  ungroup() %>%
  mutate(combo = paste0(obs, ": ", iters)) %>%
  arrange(median, combo) %>%
  mutate(combo = fct_inorder(combo)) %>%
  ggplot(aes(combo, median, fill = expression)) +
  geom_point(shape = 21, size = 4) +
  labs(x = "Observations: Iterations",
    y = "Iteration time (s)",
    fill = "Code\nsource:") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set1")

It looks like there is a big spike in the differences between the two code sources as the number of “iterations” increases (not that I defined that term super well for you all, but as a relative metric, hopefully the gains in speed are at least relatively appreciated between R and Rcpp).

Wetlands and landscapes