Oren Bochman’s Blog
Archive

CI - Libs for Python

Sunday, February 15, 2026

Robert Ness on LLMs & CI

Saturday, February 14, 2026

The Art of Discipline: Mastering Self-Control for a Fulfilling Life

Wednesday, January 21, 2026

KV Cache Efficiency & Context Platform Engineering

Monday, January 19, 2026

NVIDIA Nsight Systems GPU profiling

Monday, January 19, 2026

RL Agents Last All Summer Long

Wednesday, January 14, 2026

Rise of the agents

Wednesday, January 14, 2026

Three modes of convergence of random variables

Friday, January 9, 2026

Convoluted Intuitions

Saturday, January 3, 2026

Laws of Large Numbers

Friday, January 2, 2026

Sick

Thursday, January 1, 2026

Anthropic’s Claude: Advancing AI with Safety and Scalability

Sunday, December 14, 2025

Combining Zarr, HDF5, and TIFF into a single data format

Friday, December 12, 2025

GPU Accelerated Zarr

Friday, December 12, 2025

Bodo DataFrames: a fast and scalable HPC-based drop-in replacement for Pandas

Friday, December 12, 2025

Beyond Just Prediction: Causal Thinking in Machine Learning

Friday, December 12, 2025

Garbage In, Lawsuit Out: Building Compliant and Reproducible ML Pipelines

Friday, December 12, 2025

Building a Lightweight Feature Store for Electricity Grid Forecasts with Polars

Friday, December 12, 2025

GPU Python for the Real World: Practical Steps to GPU-Accelerated Python with RAPIDS

Friday, December 12, 2025

Scaling Data Processing for LLMs with NeMo Curator

Friday, December 12, 2025

PyData/Sparse & Finch - Extending sparse computing in the Python ecosystem

Friday, December 12, 2025

TinyTroupe: Enhancing Marketing Insights through LLM-Powered Multiagent Persona Simulation

Friday, December 12, 2025

How to Effectively use text embeddings in tree based models

Friday, December 12, 2025

When the Meter Maxes Out: Chernobyl Disaster Lessons for ML Systems in Production

Friday, December 12, 2025

Automating ML with PyCaret: Train & Compare Multiple Models to Find the Best Performer

Thursday, December 11, 2025

How Big are SLMs

Thursday, December 11, 2025

Hands-on with Blosc2: Accelerating Your Python Data Workflows

Wednesday, December 10, 2025

Decisions Under Uncertainty: A Hands‑On Guide to Bayesian Decision Theory

Wednesday, December 10, 2025

From Ideas to APIs: Delivering Fast with Modern Python

Wednesday, December 10, 2025

Lessons in Decision Making from the Monty Hall Problem

Wednesday, December 10, 2025

Optimal Variable Binning in Logistic Regression

Wednesday, December 10, 2025

Optimizing AI/ML Workloads: Resource Management and Cost Attribution

Wednesday, December 10, 2025

Realtime Financial Fraud Detection with Modern Python

Wednesday, December 10, 2025

Reviving Survival Analysis: Timeless, Yet Overlooked?

Wednesday, December 10, 2025

Time series analysis for coupled neurons.

Wednesday, December 10, 2025

Using MCP to turn Claude into a Football Opposition Analyst

Wednesday, December 10, 2025

Using Traditional AI and LLMs to Automate Complex and Critical Documents in Healthcare

Tuesday, December 9, 2025

Building LLM-Powered Applications for Data Scientists and Software Engineers

Tuesday, December 9, 2025

From Feature Engineering to Context Engineering for Agents

Tuesday, December 9, 2025

I Built a Transformer from Scratch So You Don’t Have To

Tuesday, December 9, 2025

The Lifecycle of a Jupyter Environment - From Exploration to Production-Grade Pipelines

Tuesday, December 9, 2025

LLMs, Chatbots, and Dashboards: Visualize Your Data with Natural Language

Tuesday, December 9, 2025

Lessons learnt in optimizing a large-scale pandas application using Polars, FireDucks and cuDF: Go Smart and Save More!

Tuesday, December 9, 2025

projspec: what’s this project anyway?

Tuesday, December 9, 2025

Python Meets Excel: Smarter Workflows for Analysts and Data Teams

Tuesday, December 9, 2025

Python Worst Practices - Learn from the Expert

Tuesday, December 9, 2025

Designing a Fast, Offline-Capable Reverse Geocoder in Python: An Open Source Alternative to Big Geo APIs

Tuesday, December 9, 2025

Scaling Fuzzy Product Matching with BM25: A Comparative Study of Python and Database Solutions

Tuesday, December 9, 2025

Harnessing Generative Models for Synthetic Non-Life Insurance Data

Tuesday, December 9, 2025

torchTextClassifiers : Modernizing Text classification for French National Statistics

Tuesday, December 9, 2025

When AI Makes Things Up: Understanding and Tackling Hallucinations

Tuesday, December 9, 2025

Where Have All the Metrics Gone?

Tuesday, December 9, 2025

Stochastic gradient Descent – a Deep Dive

Thursday, October 9, 2025

Updates to the github action

Wednesday, October 1, 2025

Vibe coding GPT5 Edition

Thursday, September 25, 2025

FlexAttention: A Flexible Approach to Attention Mechanisms

Saturday, September 20, 2025

Probabilistic Modeling with Language Models

Sunday, September 14, 2025

Marketing Mix Model

Wednesday, September 10, 2025

Allegations of War Crimes and the Palestinian Genocide

Sunday, September 7, 2025

AI a bag of tricks

Thursday, September 4, 2025

ShinyLive ❤️ Mesa Tutorial

Tuesday, September 2, 2025

Langtalks Resources # 43

Thursday, April 3, 2025

Base line Morphology Model

Wednesday, April 2, 2025

Complex Lewis Signaling - The Research Questions

Wednesday, April 2, 2025

The roles of Partial pooling and mixed strategies in the Lewis signaling game

Tuesday, March 11, 2025

Emergent Languages

Tuesday, January 14, 2025

Planning in the Complex Lewis Game

Tuesday, January 14, 2025

The Referential Lewis Signaling Game

Tuesday, January 14, 2025

A garden of forking paths

Saturday, January 11, 2025

Complex Signals Questions

Monday, January 6, 2025

The Many Paths To A Signaling System

Sunday, January 5, 2025

Off-Policy Learning

Saturday, January 4, 2025

Rethinking Signaling systems via the lens of compositionality

Thursday, January 2, 2025

Lewis Signaling Game for PettingZoo

Wednesday, January 1, 2025

Books, Courses Tools

Wednesday, January 1, 2025

Villainy pure and simple

Thursday, December 12, 2024

Misbehavior of Markets and Scaling in financial prices 1-4

Monday, December 2, 2024

Scaling in financial prices 4

Sunday, December 1, 2024

Scaling in financial prices 3

Saturday, November 30, 2024

Scaling in financial prices 2

Friday, November 29, 2024

Scaling in financial prices 1

Thursday, November 28, 2024

Vitter’s Algorithm

Friday, October 11, 2024

TL-DR rethinking 💭 topological alignment

Tuesday, October 1, 2024

LLM the good the bad and the ugly

Monday, September 30, 2024

LLM and the missing link

Saturday, September 28, 2024

NLP with RL

Friday, September 27, 2024

Deduction Evaluation

Thursday, September 26, 2024

Fine-tune llm for Style and Grammar advice.

Wednesday, September 25, 2024

Is compositionality overrated? The view from language emergence

Sunday, September 1, 2024

Six quick tips to improve modeling

Monday, August 26, 2024

Stumpy

Thursday, August 8, 2024

replay buffer questions

Tuesday, July 2, 2024

two ideas on generalization

Monday, July 1, 2024

Mesa & RL

Tuesday, June 25, 2024

zero inflated data

Sunday, June 23, 2024

readings in rl

Tuesday, June 18, 2024

Hyperparameter Optimization

Thursday, June 13, 2024

More Sugar please

Tuesday, June 11, 2024

Risk-constrained Markov decision processes

Tuesday, June 11, 2024

Evolutionary Games and Population Dynamics Summary

Sunday, May 12, 2024

Roth Erev learning in Lewis signaling games

Thursday, May 9, 2024

Signals Experiment

Tuesday, May 7, 2024

ad hoc complex signaling systems

Sunday, May 5, 2024

Shannon Game

Thursday, May 2, 2024

Urn models using Numpy

Thursday, May 2, 2024

RAD REPL

Wednesday, May 1, 2024

Mesa Lessons

Sunday, March 31, 2024

Sugar Scapes

Sunday, March 31, 2024

OCR building blocks

Thursday, March 28, 2024

A definition by Patrick Henry Winston

Sunday, March 3, 2024

OCR - Brain Dump

Sunday, February 25, 2024

Rhetoric NLP Tasks

Saturday, February 17, 2024

😁 Quarto 💖 Mermaid🧜 Mindmaps 🧠

Monday, February 12, 2024

Lewis Game from a Bayesian Perspective

Monday, February 12, 2024

The Great Migration

Tuesday, January 30, 2024

Post With Code

Sunday, January 28, 2024

SuperLearner

Wednesday, January 10, 2024

Engineering Reinforcement Learning Algorithms

Wednesday, January 10, 2024

Understanding Emergent Languages

Thursday, January 4, 2024

D3.js in Quarto Observable

Tuesday, January 2, 2024

AutoGluon Cheatsheets

Wednesday, December 20, 2023

Summary: Synthesis and Stabilization of Complex Behaviors through Online Trajectory Optimization

Thursday, June 1, 2023

Spark Tips

Thursday, June 1, 2023

MCMC algorithms

Saturday, April 22, 2023

Quarto loves pseudocode

Tuesday, April 11, 2023

Text2topic Leverage reviews data for multi-label topics classification in Booking.com

Tuesday, February 28, 2023

Validating NLP data and models

Tuesday, February 28, 2023

Transformations in Linguistic Representation

Wednesday, February 22, 2023

event generator

Thursday, February 16, 2023

OLS regression From Scratch

Wednesday, February 1, 2023

entropy for uncertainty quantification

Thursday, September 22, 2022

Robust Regression

Monday, September 12, 2022

Loss engineering and uncertainty for multi-task learning

Monday, September 12, 2022

Wikisym 2012

Tuesday, July 26, 2022

Set Up M1 MacBooks for DS & ML

Thursday, May 5, 2022

command line

Thursday, May 5, 2022

Meme bank

Thursday, December 30, 2021

Getting more from your agency ?

Monday, September 27, 2021

Excel 2019 for Marketing Statistics in pandas

Friday, September 24, 2021

Language models and explainability

Friday, September 24, 2021

Attention for sensor fusion

Friday, September 24, 2021

Advertising Models

Tuesday, September 14, 2021

Customer Lifetime Value - Pareto/NBD (BTYD) Model

Tuesday, September 14, 2021

Storytelling and other essentials

Thursday, September 2, 2021

WaveNet

Sunday, August 29, 2021

Python Graphs

Sunday, August 29, 2021

What is in a citation?

Sunday, August 29, 2021

Hackathon session link dumps & notes

Friday, August 13, 2021

Inlining Citations for Wikipedia articles

Friday, August 13, 2021

Transfer learning in NLP

Thursday, August 12, 2021

A type of Witness and an evolving Idiom

Wednesday, July 14, 2021

json-ld

Thursday, July 1, 2021

TensorFlow probability

Tuesday, June 1, 2021

Ebook Hacks

Saturday, May 29, 2021

Multilevel Models

Sunday, May 16, 2021

Q&A and the Winograd schemas

Tuesday, April 27, 2021

Automatic Summarization Task

Saturday, April 24, 2021

Bayesian agents

Wednesday, April 14, 2021

Modeling Events

Friday, April 9, 2021

10 Tips To Improve Your Workflow

Wednesday, April 7, 2021

numpy melt down

Sunday, November 29, 2020

Deep Learning Intuitions

Sunday, October 25, 2020

brace expansion

Friday, June 12, 2020

Pandas Productivity Challenge?

Wednesday, March 4, 2020

How to avoid cross site scripting (XSS) errors with the Jupyter local runtime for Colab

Thursday, February 20, 2020

Docker for data science

Sunday, November 24, 2019

Exploding and vanishing nodes.

Wednesday, July 31, 2019

text annotation with BRAT

Tuesday, January 16, 2018

A/B testing cost and risks?

Sunday, July 30, 2017

Travel checklist

Wednesday, December 14, 2016

HotJar Heat Map Analysis - Dr. David Darmanin

Wednesday, April 20, 2016

Using Competitive Analysis to Benchmark Your Marketing Efforts Ariel Rosenstein - Similar Web

Monday, April 20, 2015

Using Competitive Analysis to Benchmark Your Marketing Efforts - Ariel Rosenstein - Similar Web

Monday, April 20, 2015

Analytics Checklist

Saturday, February 7, 2015

life hacks

Friday, June 7, 2013

Text Mining With Python

Tuesday, November 29, 2011

Text Mining With R

Tuesday, November 29, 2011

Tidy Text Mining With R

Tuesday, November 29, 2011

Time management Tips

Thursday, August 11, 2011
    • Copyright 2025, Oren Bochman