Partial SVD (or PCA) of a genotype matrix stored as a PLINK (.bed) file.

bed_randomSVD(
  obj.bed,
  fun.scaling = bed_scaleBinom,
  ind.row = rows_along(obj.bed),
  ind.col = cols_along(obj.bed),
  k = 10,
  tol = 1e-04,
  verbose = FALSE,
  ncores = 1
)

Arguments

obj.bed

Object of type bed, which is the mapping of some bed file. Use obj.bed <- bed(bedfile) to get this object.

fun.scaling

A function with parameters X, ind.row and ind.col that returns a data.frame with $center and $scale for the columns corresponding to ind.col. Each element is then scaled as follows: $$\frac{X_{i,j} - center_j}{scale_j}.$$ The default is bed_scaleBinom. You can also provide your own center and scale by using as_scaling_fun().
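
To make the scaling formula concrete, here is a minimal base-R sketch that works on a small in-memory matrix rather than on a bed object. The helper name `scale_binom_sketch` and the toy genotypes are hypothetical. The binomial centering and scaling used here (center 2p and scale sqrt(2p(1 - p)), with p the estimated allele frequency) is an assumption about what the default computes, not a copy of the package code.

```r
# Toy genotype matrix: 5 individuals (rows) x 3 SNPs (columns), coded 0/1/2.
# matrix() fills column by column, so each group of 5 values is one SNP.
G <- matrix(c(0, 1, 2, 1, 0,
              2, 2, 1, 0, 1,
              1, 0, 0, 1, 1), nrow = 5, ncol = 3)

# Hypothetical scaling function with the (X, ind.row, ind.col) signature.
# Assumption: binomial scaling, with p estimated as half the mean genotype.
scale_binom_sketch <- function(X, ind.row = seq_len(nrow(X)),
                               ind.col = seq_len(ncol(X))) {
  p <- colMeans(X[ind.row, ind.col, drop = FALSE]) / 2
  data.frame(center = 2 * p, scale = sqrt(2 * p * (1 - p)))
}

sc <- scale_binom_sketch(G)

# Apply (X[i, j] - center[j]) / scale[j] column by column.
G_scaled <- sweep(sweep(G, 2, sc$center, "-"), 2, sc$scale, "/")
```

After this step every column of `G_scaled` has mean zero. A real scaling function passed to `fun.scaling` would need to read from the bed object instead of indexing a plain matrix.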

ind.row

An optional vector of the row indices (individuals) that are used. If not specified, all rows are used.
Don't use negative indices.

ind.col

An optional vector of the column indices (SNPs) that are used. If not specified, all columns are used.
Don't use negative indices.

k

Number of singular vectors/values to compute. Default is 10. This algorithm should be used to compute only a few singular vectors/values.

tol

Precision parameter of svds. Default is 1e-4.

verbose

Should some progress be printed? Default is FALSE.

ncores

Number of cores used. Default is 1 (no parallelism). You may use bigstatsr::nb_cores().

Value

A named list (of S3 class "big_SVD") containing

  • d, the singular values,

  • u, the left singular vectors,

  • v, the right singular vectors,

  • niter, the number of iterations of the algorithm,

  • nops, the number of matrix-vector multiplications used,

  • center, the centering vector,

  • scale, the scaling vector.

Note that to obtain the Principal Components, you must use predict on the result. See examples.
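
As an illustration of what the principal component scores are, the following base-R sketch uses `svd()` on a small scaled matrix instead of a bed file. The data are made up. The relation shown (scores = U diag(d), which equals the scaled matrix times V) is the standard SVD identity; that `predict` returns this same quantity for a "big_SVD" object is an assumption here.

```r
set.seed(1)
# Made-up data: 6 individuals x 4 variables, centered and scaled by column.
X <- scale(matrix(rnorm(24), nrow = 6, ncol = 4))

k <- 2
s <- svd(X, nu = k, nv = k)  # s$d still holds all singular values
d <- s$d[seq_len(k)]         # keep only the first k

# PC scores: left singular vectors multiplied by their singular values ...
pcs <- s$u %*% diag(d, nrow = k)

# ... which equals the projection of the scaled data onto the right vectors.
pcs_proj <- X %*% s$v
```

Both `pcs` and `pcs_proj` are 6 x 2 matrices holding the same scores. With a result from bed_randomSVD(), calling predict on it is the documented way to get the principal components.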

Examples

bedfile <- system.file("extdata", "example.bed", package = "bigsnpr")
obj.bed <- bed(bedfile)

str(bed_randomSVD(obj.bed))
#> List of 7
#>  $ d     : num [1:10] 245.4 153.1 108.9 99.9 97.9 ...
#>  $ u     : num [1:517, 1:10] 0.0788 0.0804 0.0644 0.0777 0.0838 ...
#>  $ v     : num [1:4542, 1:10] 0.0032 -0.00161 0.02988 -0.01486 0.01259 ...
#>  $ niter : num 11
#>  $ nops  : num 182
#>  $ center: num [1:4542] 0.685 0.412 0.474 0.369 0.913 ...
#>  $ scale : num [1:4542] 0.671 0.572 0.601 0.549 0.704 ...
#>  - attr(*, "class")= chr "big_SVD"