On the ifelse function

Written on October 15, 2016

In this post, I will talk about the ifelse function, whose behaviour is easily misunderstood, as pointed out in my latest question on Stack Overflow. I will try to show how it can be used and misused. We will also check whether it is as fast as we would expect from a vectorized base R function.

How can it be used?

The first example comes directly from the R documentation:

x <- c(6:-4)
sqrt(x)  #- gives warning

## Warning in sqrt(x): NaNs produced

##  [1] 2.449490 2.236068 2.000000 1.732051 1.414214 1.000000 0.000000      NaN      NaN
## [10]      NaN      NaN

sqrt(ifelse(x >= 0, x, NA))  # no warning

##  [1] 2.449490 2.236068 2.000000 1.732051 1.414214 1.000000 0.000000       NA       NA
## [10]       NA       NA

So, it can be used, for instance, to handle special cases, in a vectorized, succinct way.
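
One detail worth checking here is where the call to sqrt goes. The sketch below (the variable names inner, outer and warned are only illustrative) suggests that putting sqrt inside ifelse still triggers the warning, because sqrt(x) is computed on the whole vector before ifelse picks elements from it:

```r
x <- c(6:-4)

# sqrt outside ifelse: negative values are replaced by NA before sqrt sees them
inner <- sqrt(ifelse(x >= 0, x, NA))

# sqrt inside ifelse: sqrt(x) is evaluated for every element, so it warns;
# the handler records the warning and muffles it
warned <- FALSE
outer <- withCallingHandlers(
  ifelse(x >= 0, sqrt(x), NA),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

warned                    # was a warning raised by the second form?
all.equal(inner, outer)   # the values themselves agree
```

So both forms give the same values, but only the first one avoids the warning.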

The second example comes from the vignette of Rcpp Sugar:

foo <- function(x, y) {
  ifelse(x < y, x*x, -(y*y))
}
foo(1:5, 5:1)

## [1]  1  4 -9 -4 -1

So, it can be used to construct a vector, by doing an element-wise comparison of two vectors, and specifying a custom output for each comparison.

One last example, just for fun:

(a <- matrix(1:9, 3, 3))

##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

ifelse(a %% 2 == 0, a, 0)

##      [,1] [,2] [,3]
## [1,]    0    4    0
## [2,]    2    0    8
## [3,]    0    6    0

How can it be misused?

I think many people treat ifelse as a shorter way of writing an if-then-else statement (a mistake I made myself). For example, I used to write:

legend.pos <- ifelse(is.top, ifelse(is.right, "topright", "topleft"),
                     ifelse(is.right, "bottomright", "bottomleft"))

instead of:

if (is.top) {
  if (is.right) {
    legend.pos <- "topright"
  } else {
    legend.pos <- "topleft"
  }
} else {
  if (is.right) {
    legend.pos <- "bottomright"
  } else {
    legend.pos <- "bottomleft"
  }
}

That works, but this doesn’t:

ifelse(FALSE, 0, 1:5)

## [1] 1

Indeed, if you read the R documentation carefully, you will see that ifelse returns a value with the same length and attributes as the condition (here, length 1), not as the yes or no arguments.
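
A minimal sketch of what "same length and attributes as the condition" means in practice (cond and d are illustrative names, not part of the original example):

```r
# The length of the result follows the condition, not yes/no
length(ifelse(FALSE, 0, 1:5))          # 1
length(ifelse(rep(FALSE, 5), 0, 1:5))  # 5

# Attributes of the condition (here, names) are carried over to the result
cond <- c(a = TRUE, b = FALSE)
names(ifelse(cond, 1, 2))              # "a" "b"

# Attributes of yes/no (here, the Date class) are not
d <- as.Date("2016-10-15")
class(ifelse(TRUE, d, d))              # "numeric"
```

The last case is a common surprise: the dates come back as plain numbers because the class is taken from the condition.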

If you really want to use a more succinct notation, you could use

`if`(FALSE, 0, 1:5)

## [1] 1 2 3 4 5

If you’re not familiar with this notation, I suggest you read the chapter about functions in the book Advanced R.
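
Since if returns a value in R, the legend example above can also be written without ifelse. This is only a sketch, assuming is.top and is.right are single TRUE/FALSE values (the values assigned below are made up for illustration):

```r
is.top <- TRUE     # illustrative value
is.right <- FALSE  # illustrative value

# Each `if` is an expression that returns one string; paste0 glues them together
legend.pos <- paste0(
  if (is.top) "top" else "bottom",
  if (is.right) "right" else "left"
)
legend.pos  # "topleft"
```

Unlike ifelse, if only evaluates the branch it selects and returns it unchanged, whatever its length.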

Benchmarks

Reimplementing ‘abs’

abs2 <- function(x) {
  ifelse(x < 0, -x, x)
}
abs2(-5:5)

##  [1] 5 4 3 2 1 0 1 2 3 4 5

library(microbenchmark)
x <- rnorm(1e4)

print(microbenchmark(
  abs(x), 
  abs2(x)
))

## Unit: microseconds
##     expr     min       lq       mean   median      uq       max neval
##   abs(x)   3.973   5.2975   36.19779   6.9530   9.271  1613.386   100
##  abs2(x) 496.299 523.9450 1595.51016 549.7695 634.859 80076.957   100
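
For comparison, here is a sketch of a version that uses indexed assignment instead of ifelse. The name abs3 is hypothetical, and I make no claim about its timing: add it to the microbenchmark call above to measure it on your own machine.

```r
abs3 <- function(x) {
  neg <- which(x < 0)  # positions of negative values; which() drops NAs
  x[neg] <- -x[neg]    # flip the sign of those elements only
  x
}
abs3(-5:5)
abs3(c(-2.5, 0, NA, 3))
```

Only the negative elements are touched, so NA values and the type of x are left as they are.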

Comparing with C++

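Consider the Rcpp Sugar example again. Here are four ways to compute it: the function foo defined earlier (base R ifelse), an explicit loop in Rcpp, the ifelse of Rcpp Sugar, and arithmetic on the logical condition in base R: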

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector fooRcpp(const NumericVector& x, const NumericVector& y) {
  int n = x.size();
  NumericVector res(n);
  double x_, y_;
  for (int i = 0; i < n; i++) { 
    x_ = x[i];
    y_ = y[i];
    if (x_ < y_) {
      res[i] = x_*x_;
    } else {
      res[i] = -(y_*y_);
    }
  }
  return res;
}

fooRcpp(1:5, 5:1)

## [1]  1  4 -9 -4 -1

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector fooRcppSugar(const NumericVector& x, 
                           const NumericVector& y) {
  return ifelse(x < y, x*x, -(y*y));
}

fooRcppSugar(1:5, 5:1)

## [1]  1  4 -9 -4 -1

foo2 <- function(x, y) {
  cond <- (x < y)
  cond * x^2 - (1 - cond) * y^2
}
foo2(1:5, 5:1)

## [1]  1  4 -9 -4 -1
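
A caveat on the arithmetic trick in foo2: it multiplies the branch that is not selected by zero, which is not the same as ignoring it. The sketch below repeats the two definitions so that it runs on its own, and uses an infinite input to show a case where the two functions disagree:

```r
foo <- function(x, y) {
  ifelse(x < y, x*x, -(y*y))
}
foo2 <- function(x, y) {
  cond <- (x < y)
  cond * x^2 - (1 - cond) * y^2
}

# Inf < 1 is FALSE, so the expected answer is -(1 * 1) = -1
foo(Inf, 1)   # -1
foo2(Inf, 1)  # NaN, because 0 * Inf is NaN
```

For finite inputs such as the rnorm vectors used in the benchmark, both functions agree.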

x <- rnorm(1e4)
y <- rnorm(1e4)
print(microbenchmark(
  foo(x, y),
  foo2(x, y),
  fooRcpp(x, y),
  fooRcppSugar(x, y)
))

## Unit: microseconds
##                expr     min       lq      mean  median       uq      max neval
##           foo(x, y) 510.535 542.6510 872.23474 563.510 716.9680 2439.447   100
##          foo2(x, y)  71.183  75.1560 147.17468  83.765  93.8635 1977.250   100
##       fooRcpp(x, y)  40.393  44.6970  63.59186  47.676  51.1535 1468.038   100
##  fooRcppSugar(x, y) 138.394 141.3745 179.16429 142.533 161.4045 1575.972   100

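Even though it is a vectorized base R function, ifelse is known to be slow. In the benchmark above, the median time of foo is roughly 7 times that of foo2, 12 times that of fooRcpp and 4 times that of fooRcppSugar.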

Conclusion

Be careful when you use the ifelse function. Moreover, if you make a substantial number of calls to it, be aware that it isn’t very fast, and that there are at least three faster alternatives.