Quidest?

Exploration vs Exploitation

· Lorenzo Drumond

Should I go for the decision that seems to be optimal, assuming that my current knowledge is reliable enough? Or should I go for a decision that seems to be sub-optimal for now, making the assumption that my knowledge could be inaccurate and that gathering new information could help me to improve it?

Exploitation consists of taking the decision assumed to be optimal given the data in our possession so far.

Exploration consists of taking a sub-optimal choice with the explicit goal of collecting more data in order to make a better and more informed decision in the future.

This dilemma appears in every observation-based decision-making process that has a feedback loop (observations drive decisions, and decisions drive new observations).
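To make the feedback loop concrete, here is a minimal sketch in Python. It assumes an epsilon-greedy rule, which is only one simple way to balance the two behaviours and is not something this note prescribes. The names `choose_action`, `run`, `true_rates` and `epsilon` are hypothetical, and the rewards are simulated coin flips rather than real data.

```python
import random


def choose_action(estimates, epsilon, rng):
    """Pick an option index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Exploration: pick any option uniformly at random to gather data.
        return rng.randrange(len(estimates))
    # Exploitation: pick the option that looks best given the data so far.
    return max(range(len(estimates)), key=lambda i: estimates[i])


def run(true_rates, steps=1000, epsilon=0.1, seed=0):
    """Simulate the loop: decide, observe a reward, update the estimates."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)       # how often each option was tried
    estimates = [0.0] * len(true_rates)  # running mean reward per option
    for _ in range(steps):
        action = choose_action(estimates, epsilon, rng)
        # Assumption: each option pays 1 with a fixed hidden probability.
        reward = 1.0 if rng.random() < true_rates[action] else 0.0
        counts[action] += 1
        # Incremental mean update: the new observation refines our knowledge.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return counts, estimates


if __name__ == "__main__":
    counts, estimates = run([0.2, 0.5, 0.8])
    print(counts, estimates)
```

With `epsilon=0` the loop only exploits and can lock onto whichever option looked best early on; with `epsilon=1` it only explores and never uses what it has learned. Values in between trade one against the other.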

This kind of online learning process is required either when there is not enough data to train a model (the cold-start problem) or when the data evolves over time (the non-stationarity problem).

Either way, the data we have is not enough to identify the best decision with 100% certainty.

References

Next -> multi-armed-bandits-framework

#medium #exploration #exploitation #statistics #math #tradeoff