All Reviews

🧠 Theory-Based Bayesian Models of Inductive Learning and Reasoning

How do humans make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the…

Multi-column Deep Neural Networks for Image Classification

In (Cireşan, Meier, and Schmidhuber 2012) titled “Multi-column Deep Neural Networks for Image Classification”, the authors, Dan Cireşan, Ueli Meier, Juergen Schmidhuber…

Improving Neural Networks by Preventing Co-Adaptation of Feature Detectors

In (Hinton et al. 2012) titled “Improving Neural Networks by Preventing Co-Adaptation of Feature Detectors”, the authors, Hinton, Geoffrey E., Nitish Srivastava, Alex…

ImageNet Classification with Deep Convolutional Neural Networks

(Krizhevsky, Sutskever, and Hinton 2012) is a seminal paper in the field of deep learning. It introduced the AlexNet architecture, which won the ImageNet Large Scale Visual…

Handwriting beautification using token means

In (Zitnick 2013) the author shows how we can use a model for beautifying handwriting. The problem raised is that there is a lot of variation in handwriting for a single…

NIN — Network in Network

In (Lin, Chen, and Yan 2014) the authors, Lin, Min, Qiang Chen, and Shuicheng Yan, of the paper titled “Network in Network” came up with a way of connecting some…

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

In (Srivastava et al. 2014) the authors present a novel regularization technique for deep neural networks called “dropout.” The key idea behind dropout is to randomly drop…

Some dynamics of signaling games

teaser for reading this paper

ViT — An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision…

Temporal Abstraction in Reinforcement Learning with the Successor Representation

This paper review is an extended introduction to temporal abstraction using options. It covers lots of advanced concepts in reinforcement learning that were introduced in…

FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications

Acquiring the desired font for various design tasks can be challenging and requires professional typographic knowledge. While previous font retrieval or generation works…

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

In (BehnamGhader et al. 2024) the authors consider using LLMs, which are mostly decoder-only transformers, as text encoders. This allows them to use the LLMs for NLP tasks…

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

In (Hatamizadeh and Kautz 2024), the authors apply the State Space Model (SSM) inherent in the recently introduced Mamba architecture (Gu and Dao 2023) to vision tasks. They…

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

The announcement of the GPT-5 strawberry model has sparked a lot of interest in this paper, which seems to be the theory behind OpenAI’s new model.

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

In (Bansal et al. 2024) the authors consider the trade-offs between generating synthetic data using a stronger but more expensive (SE) model versus a weaker but cheaper (WC)…

TheoremLlama: An End-To-End Framework to Train a General-Purpose Large Language Model to Become a Lean4 Expert

Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving…

Tree Attention: Topology-Aware Decoding for Long-Context Attention on GPU Clusters

In (Shyam et al. 2024) the authors propose a new algorithm for parallelizing attention computation across multiple GPUs. This enables cross-device decoding to be performed…

2BP: 2-Stage Backpropagation

In (Shyam et al. 2024) the authors …

Why bother reviewing papers?

Why bother reviewing papers?

🏒 Hockey Helmets, Concealed Weapons, and Daylight Saving – Binary Choices with Externalities

Thomas Schelling’s 1973 article explores binary choices where one person’s decision affects others, effects that economists call externalities.

Tuesday, April 1, 2025

Goal Inference as Inverse Planning

How could RL agents infer the goals of other agents?

Monday, March 31, 2025

Simulation as an engine of physical scene understanding

Model for a cognitive mechanism similar to computer engines that simulate rich physics in video games and graphics, but that uses approximate, probabilistic simulations to…

Monday, March 31, 2025

Learning Shape Priors for Single-View 3D Completion and Reconstruction

Learning priors for shapes

Saturday, March 29, 2025

🤝 Costly Signaling and Cooperation

teaser for reading this paper

Monday, March 24, 2025

🗣️ Talking to Neighbors: Evolution of Regional Meaning in Communication Games

Zollman’s paper on how adding spatial structure affects the evolutionary outcomes of games with emergent communication and social cooperation

Thursday, March 13, 2025

On Learning To Become a Successful Loser

I tracked down this paper because it is highlighted in (Skyrms 2010) as the source of a model that learns a signaling system faster. It got me started with the loss domain. I…

Thursday, January 2, 2025

Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input

Review of ‘Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input’ by Angeliki Lazaridou et al.

Wednesday, January 1, 2025

Linguistic generalization and compositionality in modern artificial neural networks

A review of the paper ‘Linguistic generalization and compositionality in modern artificial neural networks’ by Marco Baroni.

Wednesday, January 1, 2025

Compositionality and Generalization in Emergent Languages

Very exciting - this is a paper with a lot of interesting ideas. It comes with a lot of code in the form of a library called EGG as well as many Jupyter notebooks. There…

Wednesday, January 1, 2025

Emergent Communication of Generalizations

I think this is an amazing paper. I read it critically and made copious notes to see what I could learn from it. The paper points out some limitations of Lewis referential…

Wednesday, October 9, 2024

Evolutionary dynamics of Lewis signaling games: signaling systems vs. partial pooling

(Huttegger et al. 2010)

Tuesday, October 8, 2024

Why Overfitting Isn’t Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries

This paper, “Why Overfitting Isn’t Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries,” challenges the traditional view that overfitting is inherently…

Tuesday, June 11, 2024

The Evolution of Coding in signaling games

This paper considers a setting for the evolution of complex signaling systems.

Monday, June 10, 2024

👮 Multi-agent Reinforcement Learning in Sequential Social Dilemmas

Matrix games like Prisoner’s Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the choice to cooperate or defect as an atomic…

Monday, June 10, 2024

Generally Capable Agents Emerge from Open-Ended Play

The paper does not present a breakthrough like AlphaGo Zero, but it shows a very high level of creativity and innovation. I am still a newcomer to RL and this paper has…

Monday, June 10, 2024

Signals: Evolution, Learning, and Information

In (Skyrms 2010) philosopher and mathematician Brian Skyrms discusses how one can extend the concept of signaling games into full-fledged signaling systems and to some…

Wednesday, May 1, 2024

🦚 Honest Signalling Made Simple

How can we ensure signals are honest in a world where deception is rewarded? This paper delves into the theory of honest signalling in animal behavior, specifically…

Thursday, March 14, 2024

🤝 Diversity Bonuses: How Diverse Teams Achieve Superior Results

teaser for reading this paper

Sunday, October 1, 2023

sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings

Sense2Vec (Trask, Michalak, and Liu 2015) is an interesting deep learning model based on word2vec that can learn more interesting and detailed word vectors from large…

Sunday, June 26, 2022

Variational Inference with Normalizing Flows

The choice of approximate posterior distribution is one of the core problems in variational inference. Most applications of variational inference employ simple families of…

Sunday, June 26, 2022

Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment

This NIPS 1988 paper is about simplifying neural networks by removing redundant units. The authors’ approach is systematically identifying and removing redundant or…

Wednesday, June 22, 2022

Simplifying Neural Networks by soft weight sharing

This paper was mentioned in Geoffrey Hinton’s Coursera course as a way to simplify neural networks.

Wednesday, June 22, 2022

VGGNet: Very Deep Convolutional Networks for Large-Scale Image Recognition

In this paper (Simonyan and Zisserman 2015) the authors, Karen Simonyan and Andrew Zisserman from the Visual Geometry Group at Oxford, investigated the effect of increasing…

Thursday, December 10, 2015

🧠 Technical Introduction: A primer on probabilistic inference

How to model human cognition using probabilistic inference

Friday, March 20, 2015