Lesson 2 - Policy and Ethics for Experiments

Notes from Udacity A/B Testing course
a/b-testing
notes
Author

Oren Bochman

Published

Sunday, January 1, 2023

udacity

udacity

Notes from Udacity A/B Testing course.

Tuskegee and Milgram experiments

Some famous experiments where the experiments hurt the participants

  • the Tuskegee Syphilis Study where 500 participants were promised medical care but were not informed of their Syphilis diagnosis and did not get treatemnt although penecilin was available since 1942 and as a result over a 100 died while the study continued. The experiment also has no scientific merit as it was not conducted properly so as to permit collection and estimation of the statistics it claimed to study.
  • the Milgram experiment required participants to torture people by electrecuting them on a prompt from an authority figure. While the person getting tortured was an actor, many of participants were psychologically hurt by the experiment.
  • the Facebook experiment a massive psychological experiment at Facebook with 700,000 unwitting users considering how users are impacted by the content of their feed.

IRB or Institutional Review Board were setup in mostly academic institutions to certify that experiments were conducted according to ethical guidelines and to proect and inform the participants of the experiment.

It is a grey area as to whether many of these Internet studies should be subject to IRB review or not and whether informed consent is required. Neither has been common to date.

Four principles of A/B Test

Next we consider the principles of coonducting ethical A/B testing. The following are the four main principles that IRB’s look for.

Assessing Risk

what risk is the participant undertaking?

  1. Identifying the Risk It is crucial to determine the specific risks that participants might face when participating in an experiment or study. Risks can be categorized into various dimensions, including physical, psychological, emotional, social, and economic concerns. By identifying these risks, researchers can better understand the potential harm that participants might encounter.

  2. Minimal Risk The main threshold for assessing risk is whether it exceeds the concept of “minimal risk.” Minimal risk refers to the probability and magnitude of harm that a participant would typically encounter in their normal daily life. It serves as a baseline comparison to evaluate the level of risk involved in the study.

  3. Informed Consent If the identified risks go beyond the scope of minimal risk, obtaining informed consent from participants becomes necessary. Informed consent ensures that participants are fully aware of the potential risks involved in the study and voluntarily agree to participate. Researchers must provide clear and comprehensive information about the risks, benefits, and any other relevant details to enable participants to make an informed decision.

Benefits

  • What benefits might result from the study?
  • Even if the risk is minimal, how might the results help? In most online A/B testing, the benefits are around improving the product.
  • It is important to be able to state what the benefit would be from completing the study.

Alternatives

What other choices do participants have?

For example, if one istesting changes to a search engine, participants always have the choice to use another search engine. The fewer alternatives participants have, the more the test becomes coercive. Whether participants have a choice in whether to participate or not, and how that balances against the risks and benefits is at the crucx of the matter.

In online experiments, the issues to consider are what the other alternative services that a user might have, and what the switching costs might be, in terms of time, money, information, etc.

Data Sensitivity

Finally, what data is being collected, and what is the expectation of privacy and confidentiality

  • Do participants understand what data is being collected about them?

  • What harm would befall them should that data be made public?

  • Would they expect that data to be considered private and confidential?

  • For new data being collected and stored, how sensitive is the data, and what are the internal safeguards for handling that data? E.g., what access controls are there, how are breaches to that security caught and managed, etc.?

  • For that data, how will it be used and how will participants’ data be protected? How are participants guaranteed that their data, which was collected for use in the study, will not be used for some other purpose? This becomes more important as the sensitivity of the data increases.

  • What data may be published more broadly, and does that introduce any additional risk to the participants?

When new data is being gathered, then the questions come down to:

  • What data is being gathered? How sensitive is it? Does it include financial and health data?
  • Can the data being gathered be tied to the individual, i.e., is it considered personally identifiable?
  • How is the data being handled, with what security? What level of confidentiality can participants expect?
  • What harm would befall the individual should the data become public, where the harm would encompass health, psychological / emotional, social, and financial concerns?

Summary of Principles

Most studies, due to the nature of the online service, are likely minimal risk, and the bigger question is about data collection with regards to identifiability, privacy, and confidentiality / security. That said, arguably, a neutral third party outside of the company should be making these calls rather than someone with a vested interest in the outcome. One growing risk in online studies is that of bias and the potential for discrimination, such as differential pricing and whether that is discriminatory to a particular population for example.

The recommendation is that there should be internal reviews of all proposed studies by experts regarding the questions:

  • Are participants facing more than minimal risk?
  • Do participants understand what data is being gathered?
  • Is that data identifiable?
  • How is the data handled?
  • And if enough flags are raised, that an external review happen.

Reuse

CC SA BY-NC-ND

Citation

BibTeX citation:
@online{bochman2023,
  author = {Bochman, Oren},
  title = {Lesson 2 - {Policy} and {Ethics} for {Experiments}},
  date = {2023-01-01},
  url = {https://orenbochman.github.io/notes/AB-testing/l2.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2023. “Lesson 2 - Policy and Ethics for Experiments.” January 1, 2023. https://orenbochman.github.io/notes/AB-testing/l2.html.