Scaling functions for a Filebacked Big Matrix (FBM), to be used as the fun.scaling parameter of other functions in this package.

big_scale(center = TRUE, scale = TRUE)

Arguments

center

A logical value: whether to return means or 0s.

scale

A logical value: whether to return standard deviations or 1s. scale is ignored when center is FALSE: you can't scale without centering.

Value

A new function that returns a data.frame with two columns, "center" and "scale", each with one value per column in ind.col.

Details

Less common scalings are also possible, for example the "y-aware" scaling, which uses the inverse of the betas from column-wise linear regressions as scaling. See this post for details. It would be easy to implement using big_colstats to get column means and big_univLinReg to get the betas (and then invert them).
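The "y-aware" idea described above could be sketched roughly as follows. This is only an illustration, not something shipped with the package: the name big_scale_yaware and its argument y (a numeric response with one value per row of X) are hypothetical, and the sketch assumes that every beta is non-zero, since a zero beta would give an infinite scale.

```r
library(bigstatsr)

# Hypothetical helper (not part of bigstatsr): returns a scaling function
# with the same shape of result as the one produced by big_scale().
big_scale_yaware <- function(y) {
  # '...' absorbs any extra argument (e.g. ncores) a caller may pass
  function(X, ind.row = rows_along(X), ind.col = cols_along(X), ...) {
    # Column means, computed from the column sums
    stats <- big_colstats(X, ind.row = ind.row, ind.col = ind.col)
    means <- stats$sum / length(ind.row)
    # Slopes (betas) of the column-wise linear regressions of y on each column
    betas <- big_univLinReg(X, y.train = y[ind.row],
                            ind.train = ind.row, ind.col = ind.col)$estim
    # Scale by the inverse of the betas (assumes no beta is exactly 0)
    data.frame(center = means, scale = 1 / betas)
  }
}

X <- big_attachExtdata()
set.seed(1)
y <- rnorm(nrow(X))  # simulated response, for illustration only
stats_yaware <- big_scale_yaware(y)(X)
str(stats_yaware)
```

Because the result has the same "center" and "scale" columns as the output of big_scale(), such a function should be usable wherever a fun.scaling argument is expected.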

Examples

X <- big_attachExtdata()

# No scaling
big_noscale <- big_scale(center = FALSE, scale = FALSE)
class(big_noscale) # big_scale returns a new function
#> [1] "function"
str(big_noscale(X))
#> 'data.frame':	4542 obs. of  2 variables:
#>  $ center: num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ scale : num  1 1 1 1 1 1 1 1 1 1 ...
big_noscale2 <- big_scale(center = FALSE)
str(big_noscale2(X)) # you can't scale without centering
#> 'data.frame':	4542 obs. of  2 variables:
#>  $ center: num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ scale : num  1 1 1 1 1 1 1 1 1 1 ...

# Centering
big_center <- big_scale(scale = FALSE)
str(big_center(X))
#> 'data.frame':	4542 obs. of  2 variables:
#>  $ center: num  1.32 1.59 1.53 1.63 1.09 ...
#>  $ scale : num  1 1 1 1 1 1 1 1 1 1 ...
# + scaling
str(big_scale()(X))
#> 'data.frame':	4542 obs. of  2 variables:
#>  $ center: num  1.32 1.59 1.53 1.63 1.09 ...
#>  $ scale : num  0.679 0.569 0.627 0.558 0.719 ...
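In practice, the function returned by big_scale() is usually not called directly but passed as the fun.scaling argument of another function. A minimal sketch, assuming big_SVD() as the consumer and k = 10 as an arbitrary number of components:

```r
library(bigstatsr)

X <- big_attachExtdata()

# Partial SVD of the centered and scaled matrix; the scaling is computed
# by the function passed as fun.scaling, X itself is not modified
svd <- big_SVD(X, fun.scaling = big_scale(), k = 10)

# The centers and scales that were used are stored in the result
str(svd$center)
str(svd$scale)
```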