Author

Oren Bochman

Published

Tuesday, September 17, 2024

The missing link is my name for a set of agents that should be able to edit Wikipedia, or at least significantly reduce the effort needed to contribute to it.

Tasks:

  1. Wikification - maximize the entropy and mutual information of the wiki's links, i.e. choose links to other articles that the reader is most likely to click on, rather than links to the most famous articles, like USA, which contribute no information to the reader.
  2. Inlining citations
  3. Adding missing references
  4. Adding missing sections across languages
  5. Improving readability
    • Most Wikipedia articles are poorly written compared with the best science writing in the world.
  6. Addressing biases and COI issues. [^We may need to train the LLM on material that does not include Wikipedia, or to create a version that can separate Wikipedia and non-Wikipedia material, possibly using CLIP?]
    • With the advent of LLMs we can now collect all the material in an article's sources and use it to rewrite a more complete article, perhaps one with fewer biases.1 Furthermore, it is fairly easy to source additional material from the web and other sources, which gives a second view on the article's point of view.
  7. Addressing vandalism and spam - this can be learned across articles
  8. Extracting Wikidata from articles - again, this can be learned across many articles by mapping each article to the Wikidata entries of its primary and secondary entities.
  9. Replace low register terms with high register terms - with an eye to improving readability. One hopes that the higher register terms are more precise and less ambiguous.
  10. Replace highly ambiguous terms with less ambiguous terms. The same perhaps for sentences.
  11. Make use of other media - diagrams, maths, code, images, videos, maps and so on should be more than referenced in the text.
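
The wikification idea in task 1 can be sketched in code. The sketch below is one possible reading, not a settled method: it assumes we have click counts for links leaving the current article and click counts for each target across the whole wiki, and it ranks candidates by pointwise mutual information (PMI). The function name `rank_links_by_pmi` and all counts are hypothetical and made up for illustration.

```python
import math


def rank_links_by_pmi(clicks_from_source, global_clicks, top_k=3):
    """Rank candidate link targets by pointwise mutual information, in bits.

    PMI(source, target) = log2( p(target | source) / p(target) ).
    A target that is clicked everywhere (high p(target)) scores low,
    even if it is also clicked often from this article.
    """
    source_total = sum(clicks_from_source.values())
    global_total = sum(global_clicks.values())
    scored = []
    for target, count in clicks_from_source.items():
        # Skip targets with no observed clicks; log2(0) is undefined.
        if count == 0 or global_clicks.get(target, 0) == 0:
            continue
        p_target_given_source = count / source_total
        p_target = global_clicks[target] / global_total
        scored.append((target, math.log2(p_target_given_source / p_target)))
    # Highest PMI first: the most informative links for this article.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy counts, invented for this example (assumption, not real data).
clicks_from_source = {"USA": 50, "Claude Shannon": 20, "Mutual information": 30}
global_clicks = {"USA": 9000, "Claude Shannon": 600, "Mutual information": 400}

ranking = rank_links_by_pmi(clicks_from_source, global_clicks)
for target, pmi in ranking:
    print(f"{target}: {pmi:.2f} bits")
```

With these toy counts the generic USA link gets a negative score despite attracting the most clicks from the article, while the two topic-specific links rank above it. A real agent would need smoothing for rare targets and some source of click or link statistics, both of which are left open here.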

1 LLMs inherit and amplify biases from their training material, so this aspect is an area of active research and may require some creativity.

Citation

BibTeX citation:
@online{bochman2024,
  author = {Bochman, Oren},
  title = {LLM and the Missing Link},
  date = {2024-09-17},
  url = {https://orenbochman.github.io/posts/2024/2024-09-30-LLMs/the-missing-link.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2024. “LLM and the Missing Link.” September 17, 2024. https://orenbochman.github.io/posts/2024/2024-09-30-LLMs/the-missing-link.html.