Polygenic Risk Scores for a grid of clumping and thresholding parameters.
Stacking over many Polygenic Risk Scores, corresponding to a grid of many different parameters for clumping and thresholding.
snp_grid_clumping(
  G,
  infos.chr,
  infos.pos,
  lpS,
  ind.row = rows_along(G),
  grid.thr.r2 = c(0.01, 0.05, 0.1, 0.2, 0.5, 0.8, 0.95),
  grid.base.size = c(50, 100, 200, 500),
  infos.imp = rep(1, ncol(G)),
  grid.thr.imp = 1,
  groups = list(cols_along(G)),
  exclude = NULL,
  ncores = 1
)

snp_grid_PRS(
  G,
  all_keep,
  betas,
  lpS,
  n_thr_lpS = 50,
  grid.lpS.thr = 0.9999 * seq_log(max(0.1, min(lpS, na.rm = TRUE)),
                                  max(lpS, na.rm = TRUE), n_thr_lpS),
  ind.row = rows_along(G),
  backingfile = tempfile(),
  type = c("float", "double"),
  ncores = 1
)

snp_grid_stacking(
  multi_PRS,
  y.train,
  alphas = c(1, 0.01, 1e-04),
  ncores = 1,
  ...
)
G  A FBM.code256 (typically <bigSNP>$genotypes).
infos.chr  Vector of integers specifying each SNP's chromosome. 
infos.pos  Vector of integers specifying the physical position
on a chromosome (in base pairs) of each SNP. 
lpS  Numeric vector of -log10(p-value) associated with betas.
ind.row  An optional vector of the row indices (individuals) that
are used. If not specified, all rows are used. 
grid.thr.r2  Grid of thresholds over the squared correlation between
two SNPs for clumping. Default is c(0.01, 0.05, 0.1, 0.2, 0.5, 0.8, 0.95).
grid.base.size  Grid for base window sizes. Sizes are then computed as
base.size / thr.r2 (in kb). Default is c(50, 100, 200, 500).
infos.imp  Vector of imputation scores. Default is all 1.
grid.thr.imp  Grid of thresholds over infos.imp. Default is 1.
groups  List of vectors of indices to define your own categories. This could be used e.g. to derive C+T scores using two different GWAS summary statistics, or to include other information such as functional annotations. Default just makes one group with all variants. 
exclude  Vector of SNP indices to exclude anyway. 
ncores  Number of cores used. Default is 1 (no parallelism). You may use nb_cores().
all_keep  Output of snp_grid_clumping().
betas  Numeric vector of weights (effect sizes from GWAS) associated
with each variant (column of G).
n_thr_lpS  Length for default grid.lpS.thr. Default is 50.
grid.lpS.thr  Sequence of thresholds to apply on lpS. Default is a grid (of length n_thr_lpS) evenly spaced on a logarithmic scale.
backingfile  Prefix for backingfiles where to store scores of C+T. As we typically use a large grid, this can result in a large matrix so that we store it on disk. Default uses a temporary file. 
type  Type of backingfile values. Either "float" (the default) or "double". Using "float" requires half the disk space.
multi_PRS  Output of snp_grid_PRS().
y.train  Vector of phenotypes. If there are two levels (binary 0/1),
it uses bigstatsr::big_spLogReg() for stacking, otherwise bigstatsr::big_spLinReg().
alphas  Vector of values for grid search. See bigstatsr::big_spLogReg(). Default for this function is c(1, 0.01, 1e-04).
...  Other parameters to be passed to bigstatsr::big_spLogReg().
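
To get a feel for the size of the default grids, they can be reproduced in base R. This is only a minimal sketch, not the package's internal code: `seq_log_sketch()` is a hypothetical stand-in for `seq_log()`, the `lpS` values are made up for illustration, and the window-size formula assumes that sizes are `base.size / thr.r2`.

```r
# Default clumping grid, copied from the usage of snp_grid_clumping()
grid.thr.r2    <- c(0.01, 0.05, 0.1, 0.2, 0.5, 0.8, 0.95)
grid.base.size <- c(50, 100, 200, 500)

# One row per (thr.r2, base.size) pair: 7 x 4 = 28 combinations
clumping_grid <- expand.grid(thr.r2 = grid.thr.r2, base.size = grid.base.size)

# Assumption: window size = base.size / thr.r2
clumping_grid$size <- clumping_grid$base.size / clumping_grid$thr.r2
nrow(clumping_grid)

# Hypothetical stand-in for seq_log(): values evenly spaced on a log scale
seq_log_sketch <- function(from, to, length.out) {
  exp(seq(log(from), log(to), length.out = length.out))
}

# Made-up lpS values (assumed to be -log10 p-values), for illustration only
lpS <- c(0.05, 0.3, 1.2, 2.5, 4, 7.3, 12)
n_thr_lpS <- 50

# Same expression as the default of grid.lpS.thr, with the stand-in above
grid.lpS.thr <- 0.9999 * seq_log_sketch(
  max(0.1, min(lpS, na.rm = TRUE)), max(lpS, na.rm = TRUE), n_thr_lpS
)
range(grid.lpS.thr)
```

With these defaults, each group of variants is clumped under 28 parameter combinations, and each resulting set is then thresholded at 50 values. The lower bound of 0.1 and the factor 0.9999 keep the thresholds between 0.1 and a value just below `max(lpS)`.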
snp_grid_PRS(): An FBM (matrix on disk) that stores the C+T scores
for all parameters of the grid (and for each chromosome separately).
It also stores as attributes the input parameters all_keep, betas,
lpS and grid.lpS.thr, which are also needed in snp_grid_stacking().
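
The three functions are meant to be chained. The sketch below shows one possible way to do so, assuming the bigsnpr package is installed and using its bundled example data through `snp_attachExtdata()`. The GWAS step with `big_univLogReg()`, the variable names, and the training-set size are illustrative assumptions, not part of this page. Running the GWAS and the stacking on the same individuals is done here only for brevity; in practice the summary statistics would usually come from an independent sample.

```r
library(bigsnpr)  # also attaches bigstatsr

# Example data shipped with the package (assumption: suitable for a toy run)
snp <- snp_attachExtdata()
G   <- snp$genotypes
CHR <- snp$map$chromosome
POS <- snp$map$physical.pos
y01 <- snp$fam$affection - 1  # recode 1/2 as 0/1

set.seed(1)
ind.train <- sort(sample(nrow(G), 400))

# Illustrative GWAS on the training individuals
gwas  <- big_univLogReg(G, y01[ind.train], ind.train = ind.train)
betas <- gwas$estim
lpS   <- -predict(gwas)  # predict() returns log10(p-values) by default

# Step 1: clumping over the default grid
all_keep <- snp_grid_clumping(G, CHR, POS, lpS = lpS,
                              ind.row = ind.train, ncores = 1)

# Step 2: C+T scores for every parameter combination, stored on disk
multi_PRS <- snp_grid_PRS(G, all_keep, betas = betas, lpS = lpS,
                          ind.row = ind.train, ncores = 1)
dim(multi_PRS)

# Step 3: stacking of all C+T scores
final_mod <- snp_grid_stacking(multi_PRS, y01[ind.train], ncores = 1)
str(final_mod, max.level = 1)
```

Because `multi_PRS` carries `all_keep`, `betas`, `lpS` and `grid.lpS.thr` as attributes, `snp_grid_stacking()` only needs the score matrix and the phenotypes of the same individuals.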