rbenchmark is a simple routine for benchmarking code written in R, a free software environment for statistical computing and graphics.

Summary

rbenchmark is inspired by the Perl module Benchmark, and is intended to facilitate benchmarking of arbitrary R code.

The library consists of just one function, benchmark, which is a simple wrapper around system.time.

Given a specification of the benchmarking process (replication counts and an evaluation environment) and an arbitrary number of expressions, benchmark evaluates each expression in the specified environment, repeats the evaluation as many times as specified, and returns the results conveniently wrapped in a data frame.
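The idea of wrapping system.time around replicated evaluations can be sketched in base R. This is only an illustration under simplifying assumptions (one expression, one replication count), not rbenchmark's actual source; the helper name time_expression is hypothetical.

```r
# Hypothetical helper illustrating the idea; this is not rbenchmark's source.
time_expression <- function(expr, replications = 100, environment = parent.frame()) {
   # resolve the default environment before any timing starts
   force(environment)
   # time `replications` evaluations of the unevaluated expression `expr`
   timing <- system.time(
      for (i in seq_len(replications)) eval(expr, environment))
   # system.time returns a named numeric vector; keep three of its entries
   data.frame(
      replications = replications,
      user.self = timing[['user.self']],
      sys.self = timing[['sys.self']],
      elapsed = timing[['elapsed']])
}

# the expression is passed quoted so that it is evaluated only inside the timer
result <- time_expression(quote(sum(rnorm(1000))), replications = 50)
print(result)
```

The result is a one-row data frame; the timings themselves will differ from run to run and from machine to machine.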

Installation

rbenchmark can be installed as an R package complete with documentation by downloading the tar.gz package file (see the downloads section) and executing, at the command shell, the command

R CMD INSTALL rbenchmark_xx.tar.gz

where xx must be replaced with the appropriate version number.

rbenchmark can also be loaded by sourcing, in an active R session, the rbenchmark.r source file, either from your local directory or directly from Google:

# loading rbenchmark from the current directory
source('rbenchmark.r')

# loading benchmark directly from googlecode
source('http://rbenchmark.googlecode.com/svn/trunk/benchmark.r')

Consult the appropriate R documentation for information on installation of packages, if needed.

Specifications

Signature

benchmark has the following signature:

benchmark(..., replications, environment, columns, order, relative)

Parameters

benchmark has the following parameters:

  • replications is a numeric vector specifying how many times an expression should be evaluated when the runtime is measured. If replications consists of more than one value, each expression will be benchmarked multiple times, once for each value in replications.
  • environment is the environment in which the expressions will be evaluated.
  • columns is a character or integer vector specifying which columns should be included in the returned data frame.
  • order is a character or integer vector specifying which columns should be used to sort the data frame. Any of the columns that can be specified for columns (see above) can be used, even if it is not included in columns and will not appear in the output data frame.
  • relative is the name or index of the timing column used to calculate relative timings, with the lowest value in the specified column taken as the reference.
  • ... captures any number of unevaluated expressions passed to benchmark as named or unnamed arguments.
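How a function can collect the expressions in ... without evaluating them, and label them by argument name, can be sketched in base R as follows. The helper capture_tests is hypothetical, and it labels unnamed expressions with deparse, which is an assumption of this sketch; benchmark itself may derive its labels differently (see Notes).

```r
# Hypothetical helper: capture the expressions in `...` without evaluating them.
capture_tests <- function(...) {
   # substitute() returns the unevaluated call list(...); drop the leading `list`
   tests <- as.list(substitute(list(...)))[-1]
   labels <- names(tests)
   if (is.null(labels)) labels <- rep('', length(tests))
   # unnamed arguments are labelled with the text of the expression itself
   unnamed <- labels == ''
   labels[unnamed] <- vapply(
      tests[unnamed],
      function(e) paste(deparse(e), collapse = ' '),
      character(1))
   names(tests) <- labels
   tests
}

tests <- capture_tests(fast = sum(1:10), mean(c(1, 2, 3)))
print(names(tests))
# the captured elements are still unevaluated calls
print(is.call(tests[['fast']]))
```

Each captured element can later be evaluated with eval in whichever environment is chosen.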

The parameters replications, environment, columns, order, and relative are optional and have the following default values:

  • replications = 100
    By default, each expression will be benchmarked once, and will be evaluated 100 times within the benchmark.
  • environment = parent.frame()
    By default, all expressions will be evaluated in the environment in which the call to benchmark is made.
  • columns = c('test', 'replications', 'user.self', 'sys.self', 'elapsed', 'user.child', 'sys.child', 'relative')
    By default, the returned data frame will contain all columns generated internally in benchmark. These named columns will contain the following data:
      • test: a character string naming each individual benchmark. If the corresponding expression was passed to benchmark in a named argument, the name will be used; otherwise, the expression itself converted to a character string will be used.
      • replications: a numeric vector specifying the number of replications used within each individual benchmark.
      • user.self, sys.self, elapsed, user.child, and sys.child: values reported by system.time; see Sec. 7.1 Operating system access in The R language definition, or type ?system.time in an R session.
      • relative: timings relative to the lowest value in the reference column (see the relative parameter above).
  • order = 'test'
    By default, the data frame is sorted by the column test (the labels of the expressions or the expressions themselves; see above).
  • relative = 'elapsed'
    By default, relative timings are calculated based on the column elapsed.
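The roles of relative, order, and columns can be illustrated on a plain data frame. The timings below are made up for the illustration and are not measured results, and the computation is a sketch of the described behaviour rather than benchmark's own code.

```r
# Made-up timings for illustration only; these are not measured results.
timings <- data.frame(
   test = c('rep', 'pat', 'loop'),
   elapsed = c(0.30, 0.15, 0.60),
   stringsAsFactors = FALSE)

# relative timing: each value divided by the lowest value in the reference column
reference <- 'elapsed'
timings$relative <- timings[[reference]] / min(timings[[reference]])

# sort by one column, then keep a subset that need not include the sort column
sorted <- timings[order(timings$elapsed), c('test', 'relative')]
print(sorted)
```

The fastest test gets a relative timing of 1, and the sort column elapsed is used for ordering even though it does not appear in the final selection.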

Value

The value returned from a call to benchmark is a data frame with rows corresponding to individual benchmarks, and columns as specified above.

An individual benchmark corresponds to a unique combination (see below) of an expression from ... and a replication count from replications; if there are n expressions in ... and m replication counts in replications, the returned data frame will consist of n*m rows, each corresponding to an individual, independent (see below) benchmark.

If either ... or replications contains duplicates, the returned data frame will contain multiple benchmarks for the affected expression-replication combinations. Note that such multiple benchmarks for a particular expression-replication pair will, in general, have different timing results, since they are evaluated independently of each other (except where the expressions have side effects that influence each other's performance).
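The n*m row count, including the case with duplicates, can be checked with a small base R sketch. The labels and counts are hypothetical, and expand.grid is used here only to enumerate the combinations; it is not claimed to be how benchmark builds its result.

```r
# Hypothetical labels and counts, for illustration only.
tests <- c('rep', 'pat')            # n = 2 expressions
replications <- c(10, 100, 1000)    # m = 3 replication counts

# every expression is paired with every replication count
combinations <- expand.grid(
   replications = replications,
   test = tests,
   stringsAsFactors = FALSE)[, c('test', 'replications')]
print(combinations)
print(nrow(combinations))   # n * m rows

# duplicates are not collapsed: two identical expressions and three
# identical counts still give one row per pairing
duplicated_runs <- expand.grid(
   replications = rep(100, 3),
   test = c('pat', 'pat'),
   stringsAsFactors = FALSE)
print(nrow(duplicated_runs))
```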

Examples

To see how rbenchmark works, you can copy-paste the examples, or source a demo file that will do this for you:

# loading benchmark examples directly from googlecode
source('http://rbenchmark.googlecode.com/svn/trunk/demo.r')

If you have installed rbenchmark as a package, you can run the demos by executing, in an R session, the commands

library(rbenchmark)
example(rbenchmark)
example(benchmark)

Example 1

A simple call to benchmark with just one expression and default values for replications, environment, columns, and order:

# benchmark the allocation of one 10^6-element numeric vector, replicated 100 times
benchmark(1:10^6)
Possible output:
    test replications user.self sys.self elapsed user.child sys.child
1 1:10^6          100       0.1     0.28   0.383          0         0

Example 2

A call to benchmark with two named expressions and three replication counts, output sorted by the replication counts and then by the elapsed time:

# benchmark the application of two functions with like functionality but different implementation
means.rep = function(n, m) mean(replicate(n, rnorm(m)))
means.pat = function(n, m) colMeans(array(rnorm(n*m), c(m, n)))
benchmark(
   rep=means.rep(100, 100), 
   pat=means.pat(100, 100), 
   replications=10^(1:3),
   order=c('replications', 'elapsed'))
Possible output:
  test replications user.self sys.self elapsed user.child sys.child
4  pat           10     0.020    0.000   0.017          0         0
1  rep           10     0.052    0.000   0.053          0         0
5  pat          100     0.168    0.004   0.174          0         0
2  rep          100     0.244    0.000   0.245          0         0
6  pat         1000     1.716    0.044   1.758          0         0
3  rep         1000     2.452    0.024   2.477          0         0

Example 3

A call to benchmark with duplicate expressions and replication counts, output with selected columns, additional column computed afterwards:

# six benchmarks for means.pat(100, 100), each with 100 replications
means.pat = function(n, m) colMeans(array(rnorm(n*m), c(m, n)))
within(
   benchmark(
      replications=rep(100, 3), 
      means.pat(100, 100), 
      means.pat(100, 100), 
      columns=c('test', 'elapsed', 'replications')),
   {average=elapsed/replications})
Possible output:
                 test elapsed replications average
1 means.pat(100, 100)   0.174          100 0.00174
2 means.pat(100, 100)   0.173          100 0.00173
3 means.pat(100, 100)   0.173          100 0.00173
4 means.pat(100, 100)   0.173          100 0.00173
5 means.pat(100, 100)   0.174          100 0.00174
6 means.pat(100, 100)   0.173          100 0.00173

Example 4

A call to benchmark with a list of arbitrary predefined expressions. Relative timings are based on the elapsed timings (the default, anyway):

# application of benchmark to a list of arbitrary expressions
means.rep = function(n, m)
   mean(replicate(n, rnorm(m)))
means.pat = function(n, m)
   colMeans(array(rnorm(n*m), c(m, n)))
tests = list(
   rep=expression(means.rep(100, 100)),
   pat=expression(means.pat(100, 100)))
result = do.call(benchmark, 
   c(tests, list(
      replications=100, 
      columns=c('test', 'elapsed', 'replications', 'relative'), 
      order='elapsed',
      relative='elapsed')))
Possible output:
  test elapsed replications relative
2  pat   0.174          100     1.00
1  rep   0.248          100     1.43

Notes

Not all expressions, if passed as unnamed arguments, will be cast to character strings as you might expect:

benchmark({x = 5; 1:x^x})
will output (modulo actual timings):
  test replications user.self sys.self elapsed user.child sys.child
1    {          100         0        0   0.002          0         0
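A plausible explanation, assuming the label is obtained by coercing the captured call to a character vector and keeping its first element (this page does not state how the label is built), can be reproduced in base R. Passing the expression as a named argument, as described under Parameters, avoids the problem altogether.

```r
# Base R only: two ways of turning a braced expression into a label.
expr <- quote({x = 5; 1:x^x})

# as.character splits the call into its components;
# the first component is the name of the function being called, `{`
as_character_label <- as.character(expr)[1]
print(as_character_label)

# deparse reconstructs the source text, one string per line of code
deparsed_label <- paste(deparse(expr), collapse = ' ')
print(deparsed_label)
```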

benchmark performs no smart argument-parameter matching. Any named argument whose name is not exactly 'replications', 'environment', 'columns', 'order', or 'relative' will be treated as an expression to be benchmarked:

benchmark(1:10^5, repl=1000)
will output (modulo actual timings):
    test replications user.self sys.self elapsed user.child sys.child
1 1:10^5          100     0.032    0.012   0.047          0         0
2   repl          100     0.000    0.000   0.000          0         0
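This behaviour is consistent with R's general argument matching rules: parameters that follow ... in a function definition are matched by exact name only. The function below is a hypothetical stand-in with the same shape as benchmark's signature, not benchmark itself.

```r
# Hypothetical function with the same shape as benchmark's signature.
show_arguments <- function(..., replications = 100) {
   list(replications = replications, dots = names(list(...)))
}

# `repl` is not an exact match, so it is collected by `...`
partial <- show_arguments(1:10, repl = 1000)
print(partial$replications)   # 100, the default
print(partial$dots)

# the full name is matched to the parameter
exact <- show_arguments(1:10, replications = 1000)
print(exact$replications)     # 1000
```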

Author

Wacek Kusnierczyk, waku@idi.ntnu.no
