Outlier detection based on departure from histogram. Suitable for compact values (need a space between main values and outliers).
hist_out(x, breaks = nclass.scottRob, pmax_out = 0.2, nboot = NULL)Numeric vector (with compact values).
Same parameter as for hist(). Default uses a robust version
of Scott's rule. You can also use "FD" or nclass.FD for a bit more bins.
Percentage at each side that can be considered outliers at
each step. Default is 0.2.
Number of bootstrap replicates to estimate limits more robustly.
Default is NULL (no bootstrap, even if I would recommend to use it).
A list with
x: the initial vector, whose outliers have been removed,
lim: lower and upper limits for outlier removal,
all_lim: all bootstrap replicates for lim (if nboot not NULL).
set.seed(1)
x <- rnorm(1000)
str(hist_out(x))
#> List of 2
#> $ x : num [1:1000] -0.626 0.184 -0.836 1.595 0.33 ...
#> $ lim: num [1:2] -Inf Inf
# Easy to separate
x2 <- c(x, rnorm(50, mean = 7))
hist(x2, breaks = nclass.scottRob)
str(hist_out(x2))
#> List of 2
#> $ x : num [1:1000] -0.626 0.184 -0.836 1.595 0.33 ...
#> $ lim: num [1:2] -Inf 4.25
# More difficult to separate
x3 <- c(x, rnorm(50, mean = 6))
hist(x3, breaks = nclass.scottRob)
str(hist_out(x3))
#> List of 2
#> $ x : num [1:1050] -0.626 0.184 -0.836 1.595 0.33 ...
#> $ lim: num [1:2] -Inf Inf
str(hist_out(x3, nboot = 999))
#> List of 3
#> $ x : num [1:1007] -0.626 0.184 -0.836 1.595 0.33 ...
#> $ lim : num [1:2] -Inf 4.75
#> $ all_lim: num [1:2, 1:999] -Inf 3.25 -Inf 3.25 -Inf ...