Oren Bochman’s Blog
Notes
Reinforcement Learning
RL
Published
Friday, December 20, 2024
Policy Gradient
Are tasks really ever continuing? Everything eventually breaks or dies. It’s clear that individual people do not learn from death, but we don’t live forever. Why might the…
Oren Bochman
Thursday, April 4, 2024
Control with Approximation
In this video Adam White discusses the Episodic SARSA algorithm with function approximation. He explains how this algorithm can be used to solve reinforcement learning…
Oren Bochman
Wednesday, April 3, 2024
Constructing Features for Prediction
This is not a video lecture or notes for a learning goal. It is, however, my attempt to cover some material from the readings in chapter 9 of (Sutton and Barto 2018) menti…
Oren Bochman
Tuesday, April 2, 2024
Constructing Features for Prediction
We discussed methods for representing large, and possibly continuous, state spaces, and ways to construct features. A representation is an agent’s internal encoding of the state…
Oren Bochman
Tuesday, April 2, 2024
On-Policy Prediction with Approximation
Some of the notes I made in this course became a bit too long. Rather than break the flow of the lesson I decided to move them to a separate file. This is one of those notes.
Oren Bochman
Monday, April 1, 2024
On-Policy Prediction with Approximation
I did not find the derivation of the SGD algorithm particularly enlightening, and I have seen it several times. However, the online setting is the best motivation for the use of SGD…
Oren Bochman
Monday, April 1, 2024
Sample-based Learning Methods
In this module we cover model-based RL with sampling. We start with the Dyna architecture. Then we consider the tabular Q-planning algorithm, Tabular Dyna-Q, and Dyna-Q+…
Oren Bochman
Monday, March 4, 2024
Temporal Difference Learning Methods for Control
This week, we will learn to use TD learning for control, as a generalized policy iteration strategy. We will see three different algorithms based on bootstrapping and…
Oren Bochman
Sunday, March 3, 2024
Temporal Difference Learning Methods for Prediction
In this unit we define some key terms like rewards, states, actions, value functions, and action-value functions. Then we consider the multi-armed bandit problem, leading…
Oren Bochman
Saturday, March 2, 2024
Monte-Carlo Methods for Prediction & Control
In this module we learn about sample-based MC methods that allow learning from sampled episodes. We revise our initial algorithm to better handle exploration. In off-policy…
Oren Bochman
Friday, March 1, 2024
Dynamic Programming
In week 4 we learn how to compute value functions and optimal policies, assuming you have the MDP model. You will implement dynamic programming to compute value functions…
Oren Bochman
Thursday, May 5, 2022
Value Functions & Bellman Equations
In week 3 we learn about Value Functions and Bellman Equations, which are the key technology behind all the algorithms we will learn. We learn the definition of policies and…
Oren Bochman
Wednesday, May 4, 2022
Markov Decision Processes
In week 2 we learn about Markov Decision Processes (MDP) and how to compute value functions and optimal policies, assuming you have the MDP model. We implement dynamic…
Oren Bochman
Tuesday, May 3, 2022
The K-Armed Bandit Problem
In week 1 we define some key concepts like rewards, states, actions, value functions, and action-value functions. We consider the multi-armed bandit problem, leading to…
Oren Bochman
Monday, May 2, 2022
Course Introduction
In week 1 we define some key concepts like rewards, states, actions, value functions, and action-value functions. We consider the multi-armed bandit problem, leading to…
Oren Bochman
Sunday, May 1, 2022