Dynamic collaborative filtering Thompson Sampling for cross-domain advertisements recommendation

Paper Review

Paper

Review

Bandit

Advertising

Collaborative Filtering

Introducing the Dynamic Collaborative Filtering Thompson Sampling (DCTS) algorithm for cross-domain advertisement recommendation.

Published

Saturday, January 18, 2025

Keywords

Thompson Sampling, Recommender System, Collaborative Filtering

I don’t have much time for this today, so here is a quick note on (Ishikawa, Chung, and Hirate 2022).

TL;DR - Dynamic collaborative filtering via Thompson Sampling

The authors use Thompson Sampling, a Bayesian method from RL.
Their problem is an advert recommendation system, so they integrate Thompson sampling into making recommendations.
The talk mentions a dataset the authors used for this work. Is this dataset available? I would like to try this out.

One line on Thompson sampling: it is one of the oldest techniques in the RL playbook and uses the following rule: sample a value for each action from the posterior distribution of the action values, pick the action with the highest sampled value, and then use the outcome to update the posterior distribution for the next step.
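To make the rule concrete for myself, here is a minimal sketch of plain Beta-Bernoulli Thompson sampling. This is not the paper's DCTS algorithm, only the textbook bandit version. The names `BetaBernoulliTS` and `simulate`, the uniform Beta(1, 1) prior, and the click-through rates are all my own assumptions for illustration.

```python
import random


class BetaBernoulliTS:
    """Thompson sampling for Bernoulli rewards with a Beta(1, 1) prior per action."""

    def __init__(self, n_actions, seed=None):
        self.rng = random.Random(seed)
        # alpha = successes + 1, beta = failures + 1 (uniform prior, an assumption).
        self.alpha = [1.0] * n_actions
        self.beta = [1.0] * n_actions

    def select(self):
        # Draw one plausible value per action from its posterior,
        # then act greedily with respect to the draws.
        draws = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, action, reward):
        # Conjugate update: a click (1) increments alpha, no click (0) increments beta.
        if reward:
            self.alpha[action] += 1
        else:
            self.beta[action] += 1

    def posterior_mean(self, action):
        return self.alpha[action] / (self.alpha[action] + self.beta[action])


def simulate(true_ctrs, n_steps=3000, seed=0):
    """Run the agent against hypothetical, made-up click-through rates."""
    env_rng = random.Random(seed + 1)
    agent = BetaBernoulliTS(len(true_ctrs), seed=seed)
    pulls = [0] * len(true_ctrs)
    for _ in range(n_steps):
        action = agent.select()
        reward = 1 if env_rng.random() < true_ctrs[action] else 0
        agent.update(action, reward)
        pulls[action] += 1
    return agent, pulls


if __name__ == "__main__":
    agent, pulls = simulate([0.1, 0.5, 0.9])
    print("pulls per ad:", pulls)
    print("posterior means:", [round(agent.posterior_mean(a), 3) for a in range(3)])
```

With rates this far apart, the agent should end up spending most of its pulls on the best ad, while the posteriors for the weaker ads stay wide because they are rarely tried.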

My ideas

Find out which datasets were used.
- Rakuten Ichiba data
- Rakuten Travel dataset
Are these datasets available?
Can we make a minimal version to quickly test this kind of agent?
Figure out a framework that extends Thompson sampling to other RL problems.
- need to add P(action|state), i.e. condition the Bernoulli on the state.
- perhaps do simple counts of steps since the start or the last reward.
- perhaps using a successor representation can help
Marketing problems are the worst POMDPs. Testing on real systems is very hard, so a good environment might help.
I want to make a PettingZoo env to support single & multi-agent:
1. auctions / non-auctions
2. advertising (rec sys) with costs
3. pricing with policies.
- It should also allow incorporating real data from a dataset, directly or via sampling.
- It would be even neater to do this using a hierarchical model.
- It would be even better if we can also incorporate the product and user hierarchies.
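Two of the ideas above, a minimal test environment and conditioning the Bernoulli on the state, can be sketched together in a few lines. This is only a rough starting point under heavy assumptions: `ToyAdEnv` is a plain Python class and not a PettingZoo environment, the state is just a randomly arriving user segment, the click probabilities are made up, and `StateConditionedTS` simply keeps an independent Beta posterior per (state, action) pair with no sharing across states.

```python
import random
from collections import defaultdict


class ToyAdEnv:
    """Hypothetical toy environment: a random user segment arrives each step,
    and the click probability depends on (segment, ad)."""

    def __init__(self, ctr, seed=0):
        self.ctr = ctr  # ctr[state][action] = click probability (made-up numbers)
        self.rng = random.Random(seed)
        self.state = None

    def reset(self):
        self.state = self.rng.randrange(len(self.ctr))
        return self.state

    def step(self, action):
        reward = 1 if self.rng.random() < self.ctr[self.state][action] else 0
        self.state = self.rng.randrange(len(self.ctr))  # next segment arrives
        return self.state, reward


class StateConditionedTS:
    """Thompson sampling with one Beta posterior per (state, action) pair."""

    def __init__(self, n_actions, seed=0):
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        # params[(state, action)] = [alpha, beta], starting from Beta(1, 1).
        self.params = defaultdict(lambda: [1.0, 1.0])

    def select(self, state):
        draws = [self.rng.betavariate(*self.params[(state, a)])
                 for a in range(self.n_actions)]
        return max(range(self.n_actions), key=draws.__getitem__)

    def update(self, state, action, reward):
        # Index 0 is alpha (clicks), index 1 is beta (no clicks).
        self.params[(state, action)][0 if reward else 1] += 1

    def greedy_action(self, state):
        means = [self.params[(state, a)][0] / sum(self.params[(state, a)])
                 for a in range(self.n_actions)]
        return max(range(self.n_actions), key=means.__getitem__)


def run(ctr, n_steps=4000, seed=0):
    env = ToyAdEnv(ctr, seed=seed)
    agent = StateConditionedTS(n_actions=len(ctr[0]), seed=seed + 1)
    state = env.reset()
    total_clicks = 0
    for _ in range(n_steps):
        action = agent.select(state)
        next_state, reward = env.step(action)
        agent.update(state, action, reward)
        total_clicks += reward
        state = next_state
    return agent, total_clicks


if __name__ == "__main__":
    # Segment 0 prefers ad 0, segment 1 prefers ad 1 (hypothetical rates).
    agent, clicks = run([[0.9, 0.1], [0.1, 0.9]])
    print("clicks:", clicks, "best ads:", agent.greedy_action(0), agent.greedy_action(1))
```

Keeping a separate posterior per state is the simplest possible conditioning and will not scale to large state spaces. Replacing the segment with a step counter since the last reward, or with a learned representation, would be the next thing to try.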

The Paper

paper

References

Ishikawa, Shion, Young-joo Chung, and Yu Hirate. 2022. “Dynamic Collaborative Filtering Thompson Sampling for Cross-Domain Advertisements Recommendation.” https://arxiv.org/abs/2208.11926.