Match alleles between summary statistics and SNP information. Match by ("chr", "a0", "a1") and ("pos" or "rsid"), accounting for possible strand flips and reverse reference alleles (opposite effects).
snp_match(
  sumstats,
  info_snp,
  strand_flip = TRUE,
  join_by_pos = TRUE,
  remove_dups = TRUE,
  match.min.prop = 0.2,
  return_flip_and_rev = FALSE
)
sumstats: A data frame with columns "chr", "pos", "a0", "a1" and "beta".
info_snp: A data frame with columns "chr", "pos", "a0" and "a1".
strand_flip: Whether to try to flip strand (default is TRUE). If so, ambiguous alleles A/T and C/G are removed.
join_by_pos: Whether to join by chromosome and position (default), or instead by rsid.
remove_dups: Whether to remove duplicates (same physical position). Default is TRUE.
match.min.prop: Minimum proportion of variants in the smallest data to be matched, otherwise stops with an error. Default is 20%.
return_flip_and_rev: Whether to return internal boolean variables "_FLIP_" and "_REV_" (whether the alleles were flipped and/or reversed). Default is FALSE. Values in column $beta are multiplied by -1 for variants with alleles reversed.
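The strand-flip option is easier to follow with a small illustration. The sketch below uses base R only and is not the package's implementation; `complement()` and `is_ambiguous()` are hypothetical helper names introduced here. It shows why A/T and C/G pairs are called ambiguous: flipping the strand gives back the same pair with the alleles swapped, so a strand flip cannot be told apart from a reversal.

```r
# Hypothetical helpers for illustration; not part of the package's API.
complement <- function(allele) {
  # chartr() maps A<->T and C<->G, character by character
  chartr("ACGT", "TGCA", allele)
}

is_ambiguous <- function(a0, a1) {
  # A/T, T/A, C/G and G/C pairs are their own strand complement
  a0 == complement(a1)
}

# Alleles of the `sumstats` data frame used in the examples
a0 <- c("T", "G", "C", "A", "T", "G")
a1 <- c("G", "A", "T", "G", "A", "A")

is_ambiguous(a0, a1)  # only the fifth variant (T/A) is ambiguous
complement(a0)        # alleles as read on the opposite strand
```

With these inputs, only the T/A variant at position 755890 is flagged, which agrees with the "1 ambiguous SNPs have been removed" message in the first example.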
A single data frame with matched variants. Values in column $beta are multiplied by -1 for variants with alleles reversed.
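The sign change for reversed alleles can be sketched for a single variant. This is a base R illustration and not the function's internal code; the variable names `ss`, `ref`, `reversed` and `beta_matched` are assumptions made for this example. The values are those of the variant at position 86331 in the examples.

```r
# Summary statistics and reference alleles for one variant (position 86331)
ss  <- data.frame(a0 = "G", a1 = "A", beta = 0.250)
ref <- data.frame(a0 = "A", a1 = "G")

# Alleles are "reversed" when a0 and a1 are swapped relative to the reference
reversed <- ss$a0 == ref$a1 & ss$a1 == ref$a0

# Express the effect with respect to the reference alleles
beta_matched <- ifelse(reversed, -ss$beta, ss$beta)
beta_matched
```

The effect 0.250 becomes -0.250 once it is expressed for the reference alleles A/G, as in the second row of the example output.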
sumstats <- data.frame(
  chr = 1,
  pos = c(86303, 86331, 162463, 752566, 755890, 758144),
  a0 = c("T", "G", "C", "A", "T", "G"),
  a1 = c("G", "A", "T", "G", "A", "A"),
  beta = c(-1.868, 0.250, -0.671, 2.112, 0.239, 1.272),
  p = c(0.860, 0.346, 0.900, 0.456, 0.776, 0.383)
)
info_snp <- data.frame(
  id = c("rs2949417", "rs115209712", "rs143399298", "rs3094315", "rs3115858"),
  chr = 1,
  pos = c(86303, 86331, 162463, 752566, 755890),
  a0 = c("T", "A", "G", "A", "T"),
  a1 = c("G", "G", "A", "G", "A")
)
snp_match(sumstats, info_snp)
#> 6 variants to be matched.
#> 1 ambiguous SNPs have been removed.
#> 4 variants have been matched; 1 were flipped and 1 were reversed.
#>   chr    pos a0 a1   beta     p _NUM_ID_.ss          id _NUM_ID_
#> 1   1  86303  T  G -1.868 0.860           1   rs2949417        1
#> 2   1  86331  A  G -0.250 0.346           2 rs115209712        2
#> 3   1 162463  G  A -0.671 0.900           3 rs143399298        3
#> 4   1 752566  A  G  2.112 0.456           4   rs3094315        4
snp_match(sumstats, info_snp, strand_flip = FALSE)
#> 6 variants to be matched.
#> 4 variants have been matched; 0 were flipped and 1 were reversed.
#>   chr    pos a0 a1   beta     p _NUM_ID_.ss          id _NUM_ID_
#> 1   1  86303  T  G -1.868 0.860           1   rs2949417        1
#> 2   1  86331  A  G -0.250 0.346           2 rs115209712        2
#> 3   1 752566  A  G  2.112 0.456           4   rs3094315        4
#> 4   1 755890  T  A  0.239 0.776           5   rs3115858        5
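The match.min.prop threshold can be checked by hand against these two calls. The sketch below is plain arithmetic in base R. It assumes that "the smallest data" means whichever of sumstats and info_snp has fewer rows, and the error message is illustrative, not the one produced by the package.

```r
n_sumstats     <- 6    # rows in sumstats
n_info         <- 5    # rows in info_snp
n_matched      <- 4    # matched variants reported by both calls above
match_min_prop <- 0.2  # default value of match.min.prop

# Proportion of the smaller input that was matched (assumed definition)
prop <- n_matched / min(n_sumstats, n_info)
prop

# Illustrative check; the real function's message may differ
if (prop < match_min_prop) stop("too few variants were matched")
```

Here 4 of 5 variants are matched, a proportion of 0.8, well above the default of 0.2, so neither call stops with an error.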