Quality Control (QC) and possible conversion to bed/bim/fam files using PLINK 1.9.

snp_plinkQC(
  plink.path,
  prefix.in,
  file.type = "--bfile",
  prefix.out = paste0(prefix.in, "_QC"),
  maf = 0.01,
  geno = 0.1,
  mind = 0.1,
  hwe = 1e-50,
  autosome.only = FALSE,
  extra.options = "",
  verbose = TRUE
)

Arguments

plink.path Path to the executable of PLINK 1.9.

prefix.in Prefix (path without extension) of the dataset to be QCed.

file.type Type of the dataset to be QCed. Default is "--bfile" and corresponds to bed/bim/fam files. You can also use "--file" for ped/map files or "--vcf" for a VCF file. More information can be found at http://www.cog-genomics.org/plink/1.9/input.

prefix.out Prefix (path without extension) of the bed/bim/fam dataset to be created. Default is created by appending "_QC" to prefix.in.

maf Minimum Minor Allele Frequency (MAF) for a SNP to be kept. Default is 0.01.

geno Maximum proportion of missing values for a SNP to be kept. Default is 0.1.

mind Maximum proportion of missing values for a sample to be kept. Default is 0.1.

hwe Filters out all variants which have Hardy-Weinberg equilibrium exact test p-value below the provided threshold. Default is 1e-50.

autosome.only Whether to exclude all unplaced and non-autosomal variants. Default is FALSE.

extra.options Other options to be passed to PLINK as a string. More options can be found at http://www.cog-genomics.org/plink2/filter. If using PLINK 2.0, you could e.g. use "--king-cutoff 0.0884" to remove some related samples at the same time as the quality control.

verbose Whether to show the PLINK log. Default is TRUE.
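
The QC thresholds above correspond to standard PLINK filtering flags (--maf, --geno, --mind, --hwe, --autosome). As a rough illustration of that mapping, the base R sketch below assembles an equivalent PLINK argument string. The helper build_plink_qc_args is hypothetical and not part of bigsnpr, and the command that snp_plinkQC actually builds may differ in order or detail.

```r
# Hypothetical helper (NOT part of bigsnpr): assemble PLINK arguments that
# correspond to the QC thresholds documented above. Paths containing spaces
# are not quoted here; this is an illustration only.
build_plink_qc_args <- function(prefix.in,
                                file.type = "--bfile",
                                prefix.out = paste0(prefix.in, "_QC"),
                                maf = 0.01,
                                geno = 0.1,
                                mind = 0.1,
                                hwe = 1e-50,
                                autosome.only = FALSE,
                                extra.options = "") {
  args <- c(
    file.type, prefix.in,                 # input dataset
    "--maf", maf,                         # minimum minor allele frequency
    "--geno", geno,                       # max missing proportion per variant
    "--mind", mind,                       # max missing proportion per sample
    "--hwe", hwe,                         # HWE exact test p-value threshold
    if (autosome.only) "--autosome",      # keep autosomal variants only
    "--make-bed", "--out", prefix.out,    # write bed/bim/fam output
    if (nzchar(extra.options)) extra.options
  )
  paste(args, collapse = " ")
}

cmd <- build_plink_qc_args("data/example", maf = 0.05, autosome.only = TRUE)
print(cmd)
```

Under these assumptions, a string like this could be handed to the PLINK executable, for example with system2(plink.path, cmd); in practice snp_plinkQC does this for you and returns the path of the new bedfile.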

Value

The path of the newly created bedfile.

References

Chang, Christopher C, Carson C Chow, Laurent CAM Tellier, Shashaank Vattikuti, Shaun M Purcell, and James J Lee. 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4 (1): 7. http://dx.doi.org/10.1186/s13742-015-0047-8.

Examples

if (FALSE) {

bedfile <- system.file("extdata", "example.bed", package = "bigsnpr")
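prefix  <- sub_bed(bedfile)
plink   <- download_plink()  # fetch the PLINK 1.9 executable
test <- snp_plinkQC(plink.path = plink, prefix.in = prefix,
                    prefix.out = tempfile(), maf = 0.05, geno = 0.05,
                    mind = 0.05, hwe = 1e-10, autosome.only = TRUE)
test  # path of the newly created bedfile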
}