Binomial(2, p) scaling — bed_scaleBinom • bigsnpr

Binomial(2, p) scaling, where p is the allele frequency estimated from the data.

bed_scaleBinom(
  obj.bed,
  ind.row = rows_along(obj.bed),
  ind.col = cols_along(obj.bed),
  ncores = 1
)

Arguments

obj.bed: Object of type bed, which is the mapping of some bed file. Use obj.bed <- bed(bedfile) to get this object.
ind.row: An optional vector of the row indices (individuals) to use. If not specified, all rows are used. Don't use negative indices.
ind.col: An optional vector of the column indices (SNPs) to use. If not specified, all columns are used. Don't use negative indices.
ncores: Number of cores to use. The default (1) does not use parallelism. You may use bigstatsr::nb_cores() to choose a value.

Value

A data frame with columns $center and $scale, with one row per selected column (SNP).
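
As a rough illustration of what these two columns contain, the following base R sketch reproduces the computation on a small in-memory genotype matrix. It assumes the scaling of Patterson et al. (2006): p is estimated as half the mean of the non-missing genotypes, the center is 2p and the scale is sqrt(2p(1 - p)). The matrix G and the variable names are made up for the example; this is not the package's internal code.

```r
# Toy genotype matrix: 4 individuals (rows) x 3 SNPs (columns),
# coded as 0/1/2 allele counts, with one missing value.
G <- matrix(c(0, 1, 2, 1,
              0, 0, 1, NA,
              2, 2, 1, 1),
            nrow = 4)

# Estimated allele frequency per SNP: half the mean non-missing genotype.
p <- colMeans(G, na.rm = TRUE) / 2

# Mean and standard deviation of a Binomial(2, p) variable.
scaling <- data.frame(center = 2 * p,
                      scale  = sqrt(2 * p * (1 - p)))
print(scaling)
```

The values printed in the Examples section are consistent with this relation: a center of 0.0419 gives p of about 0.021, and sqrt(0.0419 * (1 - 0.021)) is about 0.203, the reported scale.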

Details

You will probably not call this function directly, but rather pass it as the fun.scaling argument of other functions (e.g. bed_autoSVD and bed_randomSVD).
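
The sketch below illustrates that calling pattern with plain matrices, so that it needs no package: a scaling function is passed as an argument, and the consumer calls it with the data and the row and column indices. Both scale_binom_matrix() and standardize() are hypothetical helpers written for this example; they only mimic the argument names of bed_scaleBinom() and are not part of bigsnpr.

```r
# Hypothetical matrix analogue of bed_scaleBinom(): same argument names,
# returns a data frame with one row per selected column.
scale_binom_matrix <- function(G, ind.row = seq_len(nrow(G)),
                               ind.col = seq_len(ncol(G))) {
  p <- colMeans(G[ind.row, ind.col, drop = FALSE], na.rm = TRUE) / 2
  data.frame(center = 2 * p, scale = sqrt(2 * p * (1 - p)))
}

# Hypothetical consumer: receives the scaling function as 'fun.scaling',
# calls it, then centers and scales the selected columns.
standardize <- function(G, fun.scaling,
                        ind.row = seq_len(nrow(G)),
                        ind.col = seq_len(ncol(G))) {
  sc <- fun.scaling(G, ind.row = ind.row, ind.col = ind.col)
  X <- G[ind.row, ind.col, drop = FALSE]
  X <- sweep(X, 2, sc$center, "-")
  sweep(X, 2, sc$scale, "/")
}

# Toy data without missing values or monomorphic SNPs.
G <- matrix(c(0, 1, 2, 1,
              1, 0, 1, 2,
              2, 2, 1, 1),
            nrow = 4)
Z <- standardize(G, scale_binom_matrix)
round(colMeans(Z), 10)  # each column is centered at 0
```

Passing the function itself, rather than precomputed vectors, lets the consumer decide which rows and columns the scaling is estimated on.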

References

This scaling is widely used for SNP arrays. Patterson N, Price AL, Reich D (2006). Population Structure and Eigenanalysis. PLoS Genet 2(12): e190. doi:10.1371/journal.pgen.0020190.

Examples

bedfile <- system.file("extdata", "example-missing.bed", package = "bigsnpr")
obj.bed <- bed(bedfile)

str(bed_scaleBinom(obj.bed))
#> 'data.frame':	500 obs. of  2 variables:
#>  $ center: num  0.0419 0.0829 0.1198 0.1744 0.2194 ...
#>  $ scale : num  0.203 0.282 0.336 0.399 0.442 ...

str(bed_randomSVD(obj.bed, bed_scaleBinom))
#> List of 7
#>  $ d     : num [1:10] 145.8 105.8 89.8 80 68.8 ...
#>  $ u     : num [1:200, 1:10] -0.10613 -0.03125 -0.02387 -0.00184 -0.0563 ...
#>  $ v     : num [1:500, 1:10] 0.0545 0.0518 0.0535 0.0551 0.04 ...
#>  $ niter : num 3
#>  $ nops  : num 76
#>  $ center: num [1:500] 0.0419 0.0829 0.1198 0.1744 0.2194 ...
#>  $ scale : num [1:500] 0.203 0.282 0.336 0.399 0.442 ...
#>  - attr(*, "class")= chr "big_SVD"