Compute the geometric median, i.e. the point that minimizes the sum of
Euclidean distances to the observations (rows of U).
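To make the definition concrete, the sketch below evaluates the quantity being minimized: the sum of Euclidean distances from a candidate point to every row of a matrix. The helper name `sum_dist` and the toy data are illustrative assumptions, not part of the package.

```r
# Hypothetical helper (not part of the package): sum of Euclidean
# distances from a candidate point `p` to every row of `U`.
sum_dist <- function(U, p) {
  sum(sqrt(rowSums(sweep(U, 2, p)^2)))
}

# Toy data: four corners of a square plus one distant outlier.
U_toy <- rbind(c(0, 0), c(2, 0), c(0, 2), c(2, 2), c(50, 50))

# The column-wise mean is pulled towards the outlier, so it gives a
# larger sum of distances than the centre of the square does.
obj_mean   <- sum_dist(U_toy, colMeans(U_toy))
obj_centre <- sum_dist(U_toy, c(1, 1))
obj_centre < obj_mean
```

The centre of the square is not the exact minimizer here either; it only illustrates that the mean can be far from the point that minimizes this objective.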
geometric_median(U, tol = 1e-10, maxiter = 1000, by_grp = NULL)
U: A matrix (e.g. PC scores).
tol: Convergence criterion. Default is 1e-10.
maxiter: Maximum number of iterations. Default is 1000.
by_grp: Possibly a vector for splitting rows of U into groups before
computing the geometric median for each group. Default is NULL (ignored).
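The following is a minimal sketch of how arguments like these could interact, assuming a plain Weiszfeld iteration (repeated inverse-distance-weighted means). It is not taken from the package source, which may use a different update or stopping rule; `weiszfeld_sketch` and `by_grp_sketch` are hypothetical names.

```r
# Hypothetical helper, NOT the package implementation: a plain
# Weiszfeld iteration for the geometric median of the rows of U.
weiszfeld_sketch <- function(U, tol = 1e-10, maxiter = 1000) {
  U <- as.matrix(U)
  p <- colMeans(U)                          # start from the centroid
  for (iter in seq_len(maxiter)) {
    d <- sqrt(rowSums(sweep(U, 2, p)^2))    # distance of each row to p
    w <- 1 / pmax(d, .Machine$double.eps)   # inverse distances, guarded against d = 0
    p_new <- colSums(U * w) / sum(w)        # weighted mean of the rows
    converged <- max(abs(p_new - p)) < tol  # assumed stopping rule
    p <- p_new
    if (converged) break
  }
  p
}

# Hypothetical grouped variant: split row indices by `by_grp`,
# then stack one median per group as the rows of a matrix.
by_grp_sketch <- function(U, by_grp, ...) {
  groups <- split(seq_len(nrow(U)), by_grp)
  do.call(rbind, lapply(groups, function(ind) {
    weiszfeld_sketch(U[ind, , drop = FALSE], ...)
  }))
}

# Toy data: a square plus one outlier on the diagonal.
U_toy <- rbind(c(0, 0), c(2, 0), c(0, 2), c(2, 2), c(50, 50))
med_toy <- weiszfeld_sketch(U_toy)
```

In this sketch, `tol` bounds the largest coordinate change between two successive iterates and `maxiter` caps the loop, so a looser `tol` trades accuracy for fewer iterations.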
The geometric median of all rows of U, a vector of length ncol(U).
If by_grp is provided, a matrix whose rows are the geometric medians
of the groups.
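library(bigutilsr)

# Test data shipped with bigutilsr; rows belong to three groups
X <- readRDS(system.file("testdata", "three-pops.rds", package = "bigutilsr"))
pop <- rep(1:3, c(143, 167, 207))

# PC scores from a truncated SVD (svds() comes from RSpectra)
svd <- RSpectra::svds(scale(X), k = 5)
U <- sweep(svd$u, 2, svd$d, '*')
plot(U, col = pop, pch = 20)

# Geometric median of all rows (large blue dot)
med_all <- geometric_median(U)
points(t(med_all), pch = 20, col = "blue", cex = 4)

# Geometric median within each group (smaller blue dots)
med_pop <- geometric_median(U, by_grp = pop)
points(med_pop, pch = 20, col = "blue", cex = 2)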