Dynamic collaborative filtering Thompson Sampling for cross-domain advertisements recommendation

Paper Review

Paper
Review
Bandit
Advertising
Collaborative Filtering
Introducing the Dynamic Collaborative Filtering Thompson Sampling (DCTS) algorithm for cross-domain advertisement recommendation.
Published

Saturday, January 18, 2025

Keywords

Thompson Sampling, Recommender System, Collaborative Filtering

Literature review

I don’t have much time for this today, so here is a quick note on (Ishikawa, Chung, and Hirate 2022).

TL;DR - Dynamic collaborative filtering via Thompson Sampling

DCTS in a nutshell
  1. The authors use Thompson Sampling, a Bayesian method used in RL.
  2. Their problem is an advertisement recommendation system, so they integrate Thompson Sampling into the process of making recommendations.
  3. The talk mentions a dataset the authors used for this work. Is this dataset available? I would like to try this out.

One line on Thompson Sampling: it is one of the oldest techniques in the RL playbook. The rule is: sample a value for each action from the posterior distribution of its action value, pick the action with the highest sampled value, and then use the outcome to update the posterior for the next step.
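
To make the rule concrete, here is a minimal sketch for the simplest case: Bernoulli rewards (click / no click) with a Beta posterior per action. This is plain Thompson Sampling, not the DCTS algorithm from the paper. The class name `BetaBernoulliTS`, the helper `simulate`, and the click-through rates in the usage note are my own illustrative assumptions.

```python
import random


class BetaBernoulliTS:
    """Thompson Sampling for Bernoulli rewards with a Beta(1, 1) prior per arm."""

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        self.alpha = [1.0] * n_arms  # 1 + number of observed successes (clicks)
        self.beta = [1.0] * n_arms   # 1 + number of observed failures (no click)

    def select(self):
        # Draw one plausible success rate per arm from its posterior,
        # then play the arm whose draw is largest.
        samples = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Conjugate Beta-Bernoulli update: a success bumps alpha, a failure bumps beta.
        if reward:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1


def simulate(true_ctrs, n_rounds=5000, seed=0):
    """Run the agent against hypothetical, fixed click-through rates."""
    agent = BetaBernoulliTS(len(true_ctrs), seed=seed)
    env_rng = random.Random(seed + 1)  # separate stream for the environment
    pulls = [0] * len(true_ctrs)
    for _ in range(n_rounds):
        arm = agent.select()
        reward = env_rng.random() < true_ctrs[arm]
        agent.update(arm, reward)
        pulls[arm] += 1
    return agent, pulls
```

Calling `simulate([0.05, 0.10, 0.30])` should concentrate most pulls on the last arm as its posterior sharpens, while the weaker arms still get the occasional exploratory pull.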

My ideas
  • Find what data set was used.
  • Is this dataset available?
  • Can we make a minimal version to quickly test this kind of agent?
  • Figure out a framework that extends Thompson Sampling to other RL problems.
    • Need to add P(action|state), i.e. condition the Bernoulli on the state.
    • Perhaps keep simple counts of steps since the start or since the last reward.
    • Perhaps a successor representation can help.
  • Marketing problems are the worst POMDPs. Testing on real systems is very hard, so a good environment might help.
  • I want to make a PettingZoo env to support single- and multi-agent settings:
    1. auctions / non-auctions
    2. advertising (rec sys) with costs
    3. pricing with policies.
    • It should also allow incorporating real data from a dataset, directly or via sampling.
    • It would be even neater to do this using a hierarchical model.
    • It would be even better if we can also incorporate the product and user hierarchies.
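
One way to start on the "condition the Bernoulli on the state" idea from the list above is to keep a separate Beta posterior for every (state, action) pair and use "steps since the last reward" (capped) as the state. The sketch below does exactly that against a toy environment. Everything in it is an assumption for illustration: the names `StateConditionedTS` and `run_toy_env`, the cap, and the click-probability table. It is not taken from the paper and it is not a PettingZoo environment. It is also myopic: it learns the best immediate action in each state and ignores how actions move the agent between states.

```python
import random
from collections import defaultdict


class StateConditionedTS:
    """One independent Beta-Bernoulli posterior per (state, action) pair."""

    def __init__(self, n_actions, seed=None):
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        # (state, action) -> [alpha, beta], starting from a uniform Beta(1, 1) prior.
        self.params = defaultdict(lambda: [1.0, 1.0])

    def select(self, state):
        # Sample a success rate for each action in this state, play the largest.
        samples = [
            self.rng.betavariate(*self.params[(state, a)])
            for a in range(self.n_actions)
        ]
        return max(range(self.n_actions), key=samples.__getitem__)

    def update(self, state, action, reward):
        # Index 0 is alpha (successes), index 1 is beta (failures).
        self.params[(state, action)][0 if reward else 1] += 1

    def pulls(self, state, action):
        # Number of times this (state, action) pair was played.
        a, b = self.params[(state, action)]
        return int(a + b - 2)


def run_toy_env(ctr, n_rounds=20000, cap=2, seed=0):
    """Toy loop: state = steps since the last reward, capped at `cap`.

    `ctr[state][action]` is a hypothetical click probability table.
    """
    agent = StateConditionedTS(len(ctr[0]), seed=seed)
    env_rng = random.Random(seed + 1)
    steps_since_reward = 0
    for _ in range(n_rounds):
        state = min(steps_since_reward, cap)
        action = agent.select(state)
        reward = env_rng.random() < ctr[state][action]
        agent.update(state, action, reward)
        steps_since_reward = 0 if reward else steps_since_reward + 1
    return agent
```

For example, with `ctr = {0: [0.6, 0.2], 1: [0.2, 0.6], 2: [0.6, 0.2]}` the agent should end up playing action 0 most often in states 0 and 2 and action 1 most often in state 1. A tabular posterior like this only scales to a handful of states; sharing information across states is where a hierarchical model or a successor representation could come in.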

The Paper

paper

References

Ishikawa, Shion, Young-joo Chung, and Yu Hirate. 2022. “Dynamic Collaborative Filtering Thompson Sampling for Cross-Domain Advertisements Recommendation.” https://arxiv.org/abs/2208.11926.