In jazz, a “bag of tricks” is the collection of techniques, styles, and improvisational methods that musicians pick up to enhance their performances. These tricks include scales, chord voicings, rhythmic patterns, and expressive techniques that help create unique and engaging music.
Here is my bag of tricks for … AI age.
- One problem with tools is that they form a long tail, and after a while you may forget the ones you rarely use.
- Another problem is that tools become outdated as better ones come along. At what point should we switch?
- Fire Scraper Web Scraping Python
- A commercial tool for extracting articles from web pages.
- Scrapy Web Scraping Python FOSS
- Scrapy is an open-source, collaborative web crawling framework for Python.
- word2vec Embeddings Python FOSS
- A group of related models that are used to produce word embeddings.
- FastText Embeddings Python FOSS
- A library for efficient learning of word representations and sentence classification.
- GloVe Embeddings Python FOSS
- An unsupervised learning algorithm that derives word vectors from global word-word co-occurrence statistics.
- spaCy NLP Python FOSS
- A library for natural language processing in Python.
- Prodi.gy NLP Data Annotation Python Commercial
- A modern annotation tool for creating training data for machine learning models.
- PySpark Big Data Python FOSS
- PySpark is the Python API for Apache Spark, an open-source distributed computing engine that generalizes the MapReduce paradigm.
- DataHeroes
- DataHeroes uses coresets, compact geometric representations that preserve key properties of the full dataset, to represent the full training dataset.
- Outlines LLM Structured Data
- Outlines guarantees structured outputs during generation — directly from any LLM.
- Dynamic Linear Models Time Series R Bayesian
- The R library for the eponymous Bayesian time series framework, which uses the Kalman filter and builds models by superposition of state space components.
- Stumpy Time Series Python
- STUMPY is a library for time series data mining, focusing on matrix profile algorithms.
- Pydantic Data Validation Python
- Pydantic is a data validation and settings management library using Python type annotations.
- Data Wrangler for VS Code
- Data Wrangler is a code-centric data viewing and cleaning tool, a kind of smart ETL, integrated into VS Code and VS Code Jupyter Notebooks.
- Parquet
- Parquet is a columnar storage file format optimized for use with big data processing frameworks.
- VS Code
- VS Code is a lightweight but powerful source code editor which runs on your desktop and is available for Windows, macOS, and Linux.
- Quarto Blogging Data Science FOSS
- Quarto is a Markdown-based document authoring system, powered by Pandoc, designed for data science and technical writing.
- Data Science at the Command Line
- This is a collection of command-line tools and techniques for data science, enabling efficient data manipulation, analysis, and visualization directly from the terminal.
- Deepchecks Model Evaluation Python FOSS
- Deepchecks is a library for testing and validating machine learning models.
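The Dynamic Linear Models entry above mentions the Kalman filter. As a reminder of what that filter does, here is a minimal sketch in plain Python for the simplest DLM, the local-level (random walk plus noise) model. This is not the R library's API: the function name `local_level_filter`, its parameters, and the sample data are all hypothetical, and superposition of several components is not shown.

```python
from typing import List, Sequence, Tuple


def local_level_filter(
    ys: Sequence[float],
    obs_var: float,
    state_var: float,
    m0: float = 0.0,
    c0: float = 1e6,
) -> Tuple[List[float], List[float]]:
    """Kalman filter for the local-level model (illustrative sketch).

    Observation: y_t  = mu_t + v_t,      v_t ~ N(0, obs_var)
    State:       mu_t = mu_{t-1} + w_t,  w_t ~ N(0, state_var)

    m0 and c0 are the prior mean and variance of the level; the large
    default c0 is an assumed "vague" prior.
    """
    m, c = m0, c0
    means: List[float] = []
    variances: List[float] = []
    for y in ys:
        # Predict: the level is a random walk, so only the variance grows.
        r = c + state_var
        # Variance of the one-step-ahead forecast of y.
        q = r + obs_var
        # Kalman gain: how far to move toward the new observation.
        gain = r / q
        # Update the posterior mean and variance of the level.
        m = m + gain * (y - m)
        c = (1.0 - gain) * r
        means.append(m)
        variances.append(c)
    return means, variances


# Made-up series whose level jumps from about 1 to about 3.
noisy_level = [1.2, 0.9, 1.1, 1.0, 3.1, 2.9, 3.0]
filtered_means, filtered_vars = local_level_filter(
    noisy_level, obs_var=0.25, state_var=0.05
)
print([round(v, 2) for v in filtered_means])
```

A larger `state_var` relative to `obs_var` makes the filtered level follow the data more closely; a smaller one smooths more.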
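The Stumpy entry above refers to matrix profile algorithms. To make the idea concrete, here is a brute-force sketch in plain Python: for every window of length `m`, it finds the z-normalized Euclidean distance to its nearest non-overlapping neighbour. It does not use STUMPY, which computes the same kind of profile far more efficiently. The names `naive_matrix_profile` and `znorm`, the exclusion zone of about `m / 4`, and the handling of constant windows are assumptions made for this illustration.

```python
import math
from typing import List, Sequence, Tuple


def znorm(window: Sequence[float]) -> List[float]:
    """Z-normalize a window; constant windows map to all zeros (assumption)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def naive_matrix_profile(
    series: Sequence[float], m: int
) -> Tuple[List[float], List[int]]:
    """O(n^2 * m) matrix profile: nearest-neighbour distance and index per window."""
    n_sub = len(series) - m + 1
    if m < 2 or n_sub < 2:
        raise ValueError("need m >= 2 and at least two subsequences")
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    # Skip neighbours that overlap heavily with the query (trivial matches).
    excl = math.ceil(m / 4)
    profile: List[float] = []
    indices: List[int] = []
    for i in range(n_sub):
        best, best_j = math.inf, -1
        for j in range(n_sub):
            if abs(i - j) <= excl:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if d < best:
                best, best_j = d, j
        profile.append(best)
        indices.append(best_j)
    return profile, indices


# Made-up series: the shape [0, 1, 2, 1, 0] appears at the start and the end.
demo_series = [0, 1, 2, 1, 0, 5, 5, 6, 5, 7, 0, 1, 2, 1, 0]
demo_profile, demo_indices = naive_matrix_profile(demo_series, m=5)
print(demo_indices[0], round(demo_profile[0], 3))
```

Low values in the profile mark repeated shapes (motifs), and high values mark windows unlike anything else in the series (discords).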
Citation
@online{bochman2025,
author = {Bochman, Oren},
title = {AI a Bag of Tricks},
date = {2025-10-04},
url = {https://orenbochman.github.io/posts/2025/2025-10-04-Bag-Of-Tricks/},
langid = {en}
}