Séminaire de Aurélien Delage

Exploiting structure in matroid combinatorial bandits

In this talk, we review the combinatorial semi-bandit problem under matroid constraints, where $E$ is the groundset. The regret achieved by recent approaches is optimal, in the sense that it matches the lower bound. Yet, time complexity remains an issue for large matroids, due to costly membership oracles (e.g. online recommendation that ensures diversity), or to KL indexes computations. This presentation sheds a new light on the matroid semi-bandit problem by showing how to exploit the underlying unimodal structure, and how to select the challenger to the current best arm in a randomized fashion. The interest is threefold: (i) at most two optimism computations are performed per round, instead of $\mathcal{O}(|E|)$, (ii) the number of interactions with the combinatorial structure is limited to at most O(loglog T), and (iii) the resulting algorithm only suffers negligible loss in regret. Experiments conducted on various matroid benchmarks show (i) no loss in regret compared to state-of-the-art approaches; and (ii) reduced time complexity and number of calls to the membership oracle, both by at least one order of magnitude.