# Florian Privé

R(cpp) enthusiast

# One year as a subscriber to Stack Overflow

Written on July 2, 2018

In this post, I follow up on a previous post describing how, in July last year, I spent one month mostly procrastinating on Stack Overflow (SO). It's already July again, so it's time to look back at one year of activity on Stack Overflow.

Am I still as active as before? What is my strategy for answering questions on SO?

## My activity on Stack Overflow

Again, we’ll use David Robinson’s package {stackr} to get data from the Stack Overflow API in R.

    # devtools::install_github("dgrtwo/stackr")
    suppressMessages({
      library(stackr)
      library(tidyverse)
      library(lubridate)
    })

### Evolution of my SO reputation

    myID <- "6103040"

    myRep <- stack_users(myID, "reputation-history", num_pages = 40,
                         fromdate = today() - years(1))

    myRep %>%
      arrange(creation_date) %>%
      ggplot(aes(creation_date, cumsum(reputation_change))) +
      geom_point() +
      labs(x = "Date", y = "Reputation (squared transformed)",
           title = "Evolution of my SO reputation over the last year") +
      bigstatsr::theme_bigstatsr()

So, it seems that my activity is slowing down gently (my reputation is almost proportional to the square root of time). Yet, it is still increasing steadily; so what is my strategy for answering questions on SO?
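The square-root remark can be checked numerically: if reputation grows like the square root of time, then squared reputation should be close to linear in time. Here is a minimal sketch of that check, using synthetic data as a stand-in for `myRep` (the numbers `300` and `20` are arbitrary assumptions, not values from my actual reputation history).

```r
# Synthetic stand-in for the reputation history (NOT real data):
# reputation grows like the square root of elapsed days, plus some noise.
set.seed(1)
days <- 1:365
reputation <- 300 * sqrt(days) + rnorm(length(days), sd = 20)

# If reputation ~ a * sqrt(t), then reputation^2 ~ a^2 * t is linear in t.
fit_squared <- lm(I(reputation^2) ~ days)
fit_raw     <- lm(reputation ~ days)

# Compare how well a straight line fits on each scale.
r2_squared <- summary(fit_squared)$r.squared
r2_raw     <- summary(fit_raw)$r.squared
c(squared = r2_squared, raw = r2_raw)
```

With real data, one would replace `reputation` by `cumsum(reputation_change)` and `days` by the number of days elapsed since the first record; a clearly better linear fit on the squared scale would support the square-root reading.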

### Tags I’m involved in

You’ll have to wait a bit longer for the answer to that question. As a hint, let’s look at the tags I’m involved in.

If we don’t count my first month of activity:

    stack_users(myID, "tags", num_pages = 40,
                fromdate = today() - months(11)) %>%
      select(name, count) %>%
      as_tibble()
    ## # A tibble: 155 x 2
    ##    name                count
    ##    <chr>               <int>
    ##  1 r                     187
    ##  2 performance            36
    ##  3 rcpp                   34
    ##  4 parallel-processing    33
    ##  5 foreach                19
    ##  6 r-bigmemory            14
    ##  7 vectorization          12
    ##  8 for-loop               11
    ##  9 matrix                 11
    ## 10 doparallel             10
    ## # ... with 145 more rows

I’m obviously answering only R questions. The tags I answer most often are “performance”, “rcpp”, “parallel-processing”, “foreach”, “r-bigmemory” and “vectorization”.

### Performance

As you can see, all these tags are about code performance. I really enjoy performance problems (getting the same result, but much faster).

I can spend hours on a question about performance and am sometimes rewarded with a solution that is 2-3 orders of magnitude faster (see e.g. this other post).

I hope to share my knowledge about performance in a tutorial in Toulouse next year.

So, the question was “What is my strategy for answering questions on SO?”. And the answer is... in the title: I am a subscriber.

I subscribe to tags on Stack Overflow. It has many benefits:

• You don’t have to rush to answer: the questions you receive by email are already 30 minutes old (and unanswered?), so the probability that someone else answers at the same time as you is low.

• You can focus on what you’re good at, what you’re interested in, or just what you want to learn. For example, I subscribed to the very new tag “r-future” (for the R package {future}) because I’m interested in this package, even if I don’t know how to use it yet. I had the chance to meet its author, Henrik Bengtsson, at eRum2018, and he actually already knew me through parallel-computing questions on SO :D.

However, some tags (like “performance” or “foreach”) are relevant to many programming languages, so you would be flooded with irrelevant questions if you subscribed to them directly. A simple solution to this problem is to subscribe to a feed for a combination of tags, like https://stackoverflow.com/feeds/tag?tagnames=r+and+foreach&sort=newest. I use this website to subscribe to feeds.
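If you want feeds for several tag combinations, the URL is easy to build programmatically. Here is a small sketch that follows the URL pattern above; `so_feed_url()` is a hypothetical helper I made up for illustration, not part of {stackr} or any other package.

```r
# Hypothetical helper (not from any package): build the feed URL for
# questions tagged with ALL of the given tags, following the pattern above.
so_feed_url <- function(tags, sort = "newest") {
  stopifnot(is.character(tags), length(tags) >= 1)
  paste0("https://stackoverflow.com/feeds/tag?tagnames=",
         paste(tags, collapse = "+and+"),
         "&sort=", sort)
}

so_feed_url(c("r", "foreach"))
# "https://stackoverflow.com/feeds/tag?tagnames=r+and+foreach&sort=newest"

# One feed per performance-related tag, each combined with "r"
feeds <- vapply(c("performance", "rcpp", "parallel-processing"),
                function(tag) so_feed_url(c("r", tag)),
                character(1))
```

Note that this sketch does no URL encoding, so tags containing special characters (such as “c++”) would need extra care.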

I will continue answering questions on SO, so see you there!

PS: I’m not sure you would get only unanswered questions with this technique.