Wetlands and landscapes

Data science, water, isotopes, and ecosystems

Benchmarking R and Rcpp

A speed comparison.

At the UW Data Science Center we’ve been talking speed lately. Generally R is considered a somewhat slow language (though I’m not sure that’s fair). However, there are some excellent packages and interfaces that can drastically speed up R. For example, the `Rcpp` package acts as an interface between R and C++, thereby allowing useRs to drastically speed up their analyses – or at least that’s the idea.

In this post I explore the relative speeds of equivalent R and Rcpp functions to see if we really can speed up an arbitrary function by using C++ on the back end.
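Before diving in, here is a minimal sketch of what that interface looks like (the `timesTwo` function is a toy example of my own, not part of the analysis below): `Rcpp::cppFunction()` compiles a C++ snippet on the fly and exposes it as an ordinary R function.

```r
library(Rcpp)

# Compile a one-line C++ function and expose it to the R session.
# `timesTwo` is a hypothetical example, not part of this post's analysis.
cppFunction("int timesTwo(int x) { return x * 2; }")

timesTwo(21)
```

The same mechanism underlies the `// [[Rcpp::export]]` attribute used in the larger C++ block later in this post.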

Libraries

``library(tidyverse)``
``## Warning: package 'tidyverse' was built under R version 3.6.2``
``## Warning: package 'tidyr' was built under R version 3.6.2``
``## Warning: package 'dplyr' was built under R version 3.6.2``
``````library(Rcpp)
options(tibble.width = Inf)``````

R equations

First, I’m going to make some R equations. We don’t really care what these functions do for the purposes of this exercise. But if one is curious, these equations are very, very loosely based on specific yield relationships, as a function of water table depth and microtopography. The main function being tested is `full_balance`.

``````sy <- function(x, cond) {
  out <- ifelse(cond < x, 1, 0)
  out <- 1 - (sum(out) / length(x))
  ifelse(out <= 0.01, 0.01, out)
}

new_cond <- function(curr, addit, subtrac, microt) {
  sy_cond <- sy(microt, curr)
  curr + (addit + subtrac) / sy_cond
}

full_balance <- function(iterations, initial, addition, subtraction, add_sd,
                         sub_sd, microtopo) {
  #Initialize
  N <- length(addition) #rows
  M <- iterations #columns
  out <- matrix(nrow = N, ncol = M)
  out[1, ] <- initial
  add_error <- matrix(rep(addition, times = M) + rnorm(N * M, 0, add_sd),
                      nrow = N, ncol = M, byrow = FALSE)
  sub_error <- matrix(rep(subtraction, times = M) + rnorm(N * M, 0, sub_sd),
                      nrow = N, ncol = M, byrow = FALSE)
  #Execute
  for(i in 1:iterations) { #columns
    for(j in 2:N) { #rows
      out[j, i] <- new_cond(out[j - 1, i], add_error[j, i], sub_error[j, i],
                            microtopo)
    }
  }
  #Return
  as.data.frame(out)
}``````

C++ (Rcpp) equations

Similar functions to those above, but implemented in C++ via the Rcpp package. I should note that I’m not a C++ programmer, so I tried my best to match the functions of the two languages. The main function being tested is `full_balance_cpp`.

``````#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double sy_cpp(NumericVector x, double cond) {
  //Declare
  int N = x.length();
  NumericVector cond_sum(N);
  double out;
  //First ifelse()
  // Short-hand (ternary) form:
  for(int i = 0; i < N; i++) {
    cond_sum[i] = (cond < x[i] ? 1 : 0);
  }
  //`out` line
  out = 1 - (sum(cond_sum) / N);
  //Second ifelse()
  if(out <= 0.01) {
    out = 0.01;
  }
  //Explicit return statement
  return out;
}

// [[Rcpp::export]]
double new_cond_cpp(double curr, double addit, double subtrac,
                    NumericVector microt) {
  //Declare
  double sy_cond;
  double out;
  //Execute
  sy_cond = sy_cpp(microt, curr);
  out = curr + (addit + subtrac) / sy_cond;
  //Return
  return out;
}

// [[Rcpp::export]]
DataFrame full_balance_cpp(int iterations, double initial,
                           NumericVector addition, NumericVector subtraction,
                           double add_sd, double sub_sd,
                           NumericVector microtopo) {
  //Declare
  int N = addition.length(); //Rows
  int M = iterations; //Columns
  NumericMatrix add_error(N, M);
  NumericMatrix sub_error(N, M);
  NumericMatrix out_mat(N, M);

  //Initialize
  //Set initial conditions for each iteration
  for(int i = 0; i < M; ++i) {
    out_mat(0, i) = initial; //Note `out_mat()` instead of `out_mat[]` for indexing matrices
  }

  //Execute
  //Matrices of observations with error
  for(int i = 0; i < M; ++i) {
    add_error(_, i) = addition + rnorm(N, 0, add_sd);
    sub_error(_, i) = subtraction + rnorm(N, 0, sub_sd);
  }
  //Loop through each set of observations
  for(int i = 0; i < M; ++i) {
    for(int j = 1; j < N; ++j) {
      out_mat(j, i) = new_cond_cpp(out_mat(j - 1, i), add_error(j, i),
                                   sub_error(j, i), microtopo);
    }
  }

  //Return a dataframe of results
  //Explicit coercion to a different, compatible type following the pattern:
  //  as<type>(object)
  DataFrame out = as<DataFrame>(out_mat);
  return out;
}``````

Data

Now let’s generate some data to use with our functions. Again, we don’t really care what the data actually means – it’s just an example.

``````per_obs <- 1:200
per_n <- length(per_obs)

add_means <- 0.5 * (sin(seq(0, 2 * per_n / 100 * pi, length.out = per_n))) +
  abs(rnorm(per_obs))
sub_means <- 0.75 * (cos(seq(0, 2 * per_n / 100 * pi, length.out = per_n)) - 2) +
  abs(rnorm(per_obs))
sub_means <- ifelse(sub_means > 0, 0, sub_means)

add_sds <- 1
sub_sds <- 1

init_cond <- 0

micro <- map2(.x = c(-4, -2, 0, 2, 5), .y = c(3, 1, 2, 4, 1),
              ~ rnorm(2000, .x, .y)) %>%
  simplify()``````

But what does the data look like (just for fun)?

``qplot(per_obs, add_means + sub_means) + theme_bw()``

``qplot(micro, geom = "density") + theme_bw()``

Benchmarking

Here I’m using the `bench` package to iterate over different parameter sets, so we can see how the two equivalent functions respond to changes in their inputs. Pretty cool package.

Run the benchmarks

``````micro_maker <- function(obs) {
  map2(.x = c(-4, -2, 0, 2, 5), .y = c(3, 1, 2, 4, 1),
       ~ rnorm(obs, .x, .y)) %>%
    simplify()
}

iter_maker <- function(iters) {
  iters
}

bench_press <- bench::press(
  obs = c(10, 100, 1000),
  iters = c(1, 10, 100, 1000),
  {
    iterations = iter_maker(iters)
    microtopo = micro_maker(obs)
    bench::mark(
      R = full_balance(iterations, init_cond, add_means, sub_means, add_sds,
                       sub_sds, microtopo),
      Rcpp = full_balance_cpp(iterations, init_cond, add_means, sub_means,
                              add_sds, sub_sds, microtopo),
      check = FALSE,
      max_iterations = 20
    )
  }
)``````

Investigate the results of the benchmark

Now that we have the results, let’s take a quick look at the output.

``````bench_press %>%
  mutate(expression = attr(expression, "description")) %>%
  select(-`gc/sec`, -n_itr, -n_gc, -total_time) %>%
  print(n = nrow(.))``````
``````## # A tibble: 24 x 7
##    expression   obs iters       min    median `itr/sec`   mem_alloc
##    <chr>      <dbl> <dbl>     <dbl>     <dbl>     <dbl>       <dbl>
##  1 R             10     1  0.00190   0.00258   321.          770160
##  2 Rcpp          10     1  0.000187  0.000341 2311.          104088
##  3 R            100     1  0.00318   0.00377   234.         4627848
##  4 Rcpp         100     1  0.000361  0.000445 2112.          820488
##  5 R           1000     1  0.0236    0.0247     38.0       45122144
##  6 Rcpp        1000     1  0.00287   0.00319   208.         7984488
##  7 R             10    10  0.0173    0.0206     46.7        5913768
##  8 Rcpp          10    10  0.000719  0.000818 1092.         1016184
##  9 R            100    10  0.0410    0.0477     19.7       46701968
## 10 Rcpp         100    10  0.00335   0.00383   186.         8180184
## 11 R           1000    10  0.281     0.314       3.19     454901000
## 12 Rcpp        1000    10  0.0388    0.0446     22.5       79820184
## 13 R             10   100  0.197     0.200       4.78      58746024
## 14 Rcpp          10   100  0.00707   0.00755    89.3       10139688
## 15 R            100   100  0.611     0.611       1.64     462609712
## 16 Rcpp         100   100  0.0562    0.0687     14.4       81779688
## 17 R           1000   100  2.24      2.24        0.447   4594637656
## 18 Rcpp        1000   100  0.426     0.457       2.19     798179688
## 19 R             10  1000  2.23      2.23        0.448    588214760
## 20 Rcpp          10  1000  0.0655    0.0764      9.86     101370888
## 21 R            100  1000  4.29      4.29        0.233   4627849424
## 22 Rcpp         100  1000  0.387     0.394       2.54     817770888
## 23 R           1000  1000 23.3      23.3         0.0428 45325385200
## 24 Rcpp        1000  1000  3.77      3.77        0.265   7981770888``````

Interesting. It looks like Rcpp is, indeed, faster. Also interesting is the `mem_alloc` column: Rcpp allocates far less memory as well.
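To put a rough number on the gain, we can take the median run times from the largest case in the table above (obs = 1000, iters = 1000); the values below are simply copied from that printed output.

```r
# Median run times (seconds) from the obs = 1000, iters = 1000 row pair above
r_median    <- 23.3
rcpp_median <- 3.77

# Relative speedup of Rcpp over base R for this parameter set
speedup <- r_median / rcpp_median
round(speedup, 1)
```

By the same reasoning the memory allocations for that case differ by roughly a factor of five to six in Rcpp’s favor.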

Visualize the results of the benchmark

Finally, let’s take a look at a visual representation comparing the two code sources across the different parameter sets. Points indicate the median iteration time. Note: the variance in run times is so small that it cannot be seen at the scales provided.

``````bp_time_df <- bench_press %>%
  mutate(rows = 1:n(),
         expression = attr(expression, "description"))

bp_time_df %>%
  group_by(obs, iters, expression) %>%
  summarise(median = first(median)) %>%
  ungroup() %>%
  mutate(combo = paste0(obs, ": ", iters)) %>%
  arrange(median, combo) %>%
  mutate(combo = fct_inorder(combo)) %>%
  ggplot(aes(combo, median, fill = expression)) +
  geom_point(shape = 21, size = 4) +
  labs(x = "Observations: Iterations",
       y = "Iteration time (s)",
       fill = "Code\nsource:") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set1")``````

It looks like the difference between the two code sources widens sharply as the number of “iterations” increases. I admittedly haven’t defined that term very precisely here, but even as a relative metric it should make clear how large the speed gains from Rcpp can be.