bigsnpr is an R package for the analysis of massive SNP arrays. It extends the features of package bigstatsr for the purpose of analysing genotype data.
For now, you can install this package using
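A minimal sketch of the installation step, assuming the development version is hosted on GitHub under `privefl/bigsnpr` and that the devtools package is available (both are assumptions, not stated in the text above):

```r
# Assumption: the package sources live in the GitHub repository privefl/bigsnpr.
# install.packages("devtools")  # uncomment if devtools is not installed yet
devtools::install_github("privefl/bigsnpr")
```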
For now, this package only reads bed/bim/fam files (PLINK's preferred format), using `snp_readBed`. Before reading data into this package’s special format, quality control and conversion can be done using PLINK, which can be called directly from R.
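As an illustration of calling PLINK from R, here is a hedged sketch that uses base R's `system2()`. The file prefixes and the quality-control thresholds are hypothetical, and it assumes a PLINK 1.9 executable named `plink` is on the PATH; the call is skipped otherwise.

```r
# Hypothetical file prefixes, used only for illustration
input_prefix  <- "mydata"     # expects mydata.ped and mydata.map
output_prefix <- "mydata_qc"  # PLINK writes mydata_qc.bed/.bim/.fam

# Convert to bed/bim/fam while applying basic quality control
plink_args <- c(
  "--file", input_prefix,
  "--maf", "0.05",   # drop SNPs with minor allele frequency below 5% (example threshold)
  "--geno", "0.1",   # drop SNPs with more than 10% missing calls (example threshold)
  "--make-bed",
  "--out", output_prefix
)

# Only call PLINK if an executable is actually found
plink <- Sys.which("plink")
if (nzchar(plink)) {
  status <- system2(plink, plink_args)  # exit status, 0 on success
} else {
  message("PLINK not found on the PATH; skipping the call.")
}
```

The resulting `.bed` file is then the kind of input `snp_readBed` expects.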
I use a class called `bigSNP` for representing information on massive SNP arrays. A `bigSNP` has at least 3 elements:
- a `BM.code.descriptor`, which describes a special `big.matrix` (see package bigstatsr). Rows are samples and columns are SNPs. This corresponds to the “bed” file, but each element is encoded on 8 bits rather than the 2 bits used in PLINK files, which allows storing more information without taking too much disk space.
- a `data.frame` containing some information on the individuals (read from the “.fam” file).
- a `data.frame` giving some information on the SNPs (read from the “.bim” file).
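To make the 8-bit versus 2-bit trade-off concrete, here is a small back-of-the-envelope calculation in base R. The dimensions are hypothetical, and the sketch assumes the standard SNP-major PLINK `.bed` layout (3 header bytes, 4 genotypes per byte, each SNP padded to a whole byte) and exactly one byte per genotype in the 8-bit encoding, ignoring any other overhead.

```r
# Hypothetical dimensions, chosen only for illustration
n_samples <- 1000
n_snps    <- 500000

# PLINK .bed: 3 header bytes, then ceiling(n_samples / 4) bytes per SNP
bed_bytes <- 3 + ceiling(n_samples / 4) * n_snps

# 8-bit encoding: one byte per genotype
raw_bytes <- n_samples * n_snps

# How many times larger the 8-bit encoding is
ratio <- raw_bytes / bed_bytes

cat("bed file:      ", bed_bytes / 1e6, "MB\n")
cat("8-bit encoding:", raw_bytes / 1e6, "MB\n")
cat("ratio:         ", round(ratio, 2), "\n")
```

Under these assumptions the 8-bit encoding takes about four times the space of the `.bed` file, in exchange for 256 possible values per element instead of 4.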
Please open an issue if you find a bug. If you want help using bigmemory or bigstatsr, please post on Stack Overflow with the tag r-bigmemory, and include a reproducible example (see “How to make a great R reproducible example?”).
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.