View on GitHub

memo

On the Complexity of A/B Testing

On the Complexity of A/B Testing

There are several fomrmalizations of the model of obvervations.

The first case, two type of observations occrus at the (almost) same time. For instance, if you forward users accesing to web site to PageA or PageB randomly, this can be the case. The assumption behind the example is that the number of access users are large and the accesses frequently occur.

The second case, the observation does not happen at the same time. It’s always sequential and selective.

The best arm $a^{*}$ is the arm which has the best expectation.

\[a^{*} := \arg\min_{a \in \{1, 2\}} \mathrm{E} \left[ \sum_{s=1}^{t} X_{a, s} \right] .\]

The A/B testing in the sense of the multi armed-bandit is that to find $a^{*}$ in finite time \(\mathcal{T}\). The strategy of multi-armed bandit is how we find the best arm.

\((\{A_{s}\}, \hat{a})\) is strategy, denoted by $\mathcal{A}$.

The strategy $\mathcal{A}$ is said to be $\delta$-PAC if

\[P(\hat{a} = a^{*}) \ge 1 - \delta .\]

The A/B testing is

Reference