Why bother reviewing papers?
Ground Breaking v.s. Record Breaking
While all papers should have break new ground, I follow (Kuhn 1962) in differentiating between what in … considered a jump and a gradual improvement. Most papers that announce a new SOTA results are gradual improvements - essentially tweaks to well understood experiments while the truly ground-breaking papers are rare. Often, we see the authors taking an innovation from another field (perhaps a theoretical result) and applying it to a new domain. Less frequently, the author connects several previously unrelated results or techniques and does something considered impossible. Finally, there are papers where the authors ignore prior art and develop several original ideas, creating a new field. (Like Nash did with non-cooperative game theory)
They introduce new capabilities that aren’t widely used. e.g. in(Arora et al. 2018) the authors introduce sparse coding to embeddings. However, research from information science indicates that it can take a few years for results to circulate, become cited, and adopted. This paper often seems irrelevant later on when there is new research that can get more powerful results. Record-breaking papers (most common) beat the SOTAs by a few fractions of a point and typically arrive in black-box models, and their ‘black magic’ is neither fully understood nor transferable and hard to apply. So, going back to the paper that broke the ground, I find it interesting.
Learning to think like a researcher
The way top-tier researchers like Christopher Manning, Geoffrey Hinton, Christopher Bishop and David MacKay came up with their breakthroughs is frequently motivated and outlined in their papers. These can be as simple as making a PCA that works on Neural Networks for TSNE 1. In (Pennington, Socher, and Manning 2014) such as getting embedding to work using a covariance matrix rather than a moving window. Or as complicated as introducing causal regret for credit assignment for agents solving social dilemmas. In many ways, the intellectual journey is more fascinating than any specific magic trick.
1 you do know how regular PCA work right
Reviewing paper it not as good as taking a class with them. If you can do that. However, but teaching is often not the forte for most gifted researchers - sometimes their finest hours are when writing papers. Another point is that in the class teachers are very much limited by the material they can present - the students can only cover so much new mathematics in a class. In a paper they are writing to their peers so that all bets are off and the material can be as advanced as needed and touch on many disparate fields. In Skrym’s book on the evolution of signaling systems routinely touch on game theory, evolution, information theory, sociology, and classical philosophy. The reader left to catch up on their own if they can and to dive into the literature if they are dissatisfied with some of the the author’s claim. but most courses are not as advanced as most papers. I took courses by Christopher Manning or Geoffrey Hinton. However And yet, most of their publications are just as useful to review. It is not enough to read them if you want to assimilate some of their creativity or problem solving approaches. away in the sense that they provide a unique insight into the workings of their minds. It’s fascinating and inspiring to see how these influential figures think about problems and how they approach them. Their papers offer a valuable perspective that can’t be found elsewhere, and this understanding can be a great source of motivation for your own research. What I like about these two authors is that they can take some old ideas/techniques and figure out how to use them in new settings. T-SNE came from PCA, and GLOVE came from Topic Modelling.
I won’t say that ground-breaking papers are easy to read—yet some are written with great clarity, and others, like the LSTM papers, are notoriously difficult to understand. But you may discover that reading through the paper beats watching videos or blog posts by others.
Some points to consider when reviewing the paper.
- the big picture
- what is the main innovation
- what is new for you as a reader
- anything you feel left out.
- anything you disagree with or would have done differently.
Literature Reviews
Reading some papers in fast-moving areas like ML or deep learning can provide a good overview of recent developments. These tell how the field has changed and delineate the landmark approaches and the papers in which they arrived.
More Techniques
Some talented authors come with diverse backgrounds. They will list many fascinating ideas, algorithms and techniques. Taking a few minutes to check these out Another interesting aspect of many papers is use of algorithms or techniques that I am unfamiliar with.
List of Papers I want to look at
WordSense Disambiguation
- GlossBERT
- Beyond neural scaling laws: beating power law scaling via data pruning
- LLMs Will Always Hallucinate, and We Need to Live With This
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- MemLong: Memory-Augmented Retrieval for Long Text Modeling
- Do Transformer World Models Give Better Policy Gradients? RL
- ReMamba: Equip Mamba with Effective Long-Sequence Modeling
- Deep Learning Face Representation from Predicting 10,000 Classes
- DeepFace: Closing the Gap to Human-Level Performance in Face Verification
- Deep Learning Face Representation by Joint Identification-Verification
- Deep Face Recognition
- EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
- EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees
- DIFFUSION MODELS ARE REAL-TIME GAME ENGINES RL Image c.f. demos
- Platypus: A Generalized Specialist Model for Reading Text in Various Forms
- Approaching Deep Learning through the Spectral Dynamics of Weights
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
- Improving Transformers with Probabilistic Attention Keys
- SepNE: Bringing Separability to Network Embedding
Citation
@online{bochman2025,
author = {Bochman, Oren},
title = {Why Bother Reviewing Papers?},
date = {2025-01-13},
url = {https://orenbochman.github.io/reviews/meta/meta.html},
langid = {en}
}