AI a bag of tricks

My bag of tricks for the AI age

A collection of tools and techniques I use in the AI age.
AI
Tools
Techniques
Author

Oren Bochman

Published

Saturday, October 4, 2025

Keywords

AI, Tools, Techniques, Data Science, Machine Learning, Python, R, VS Code, Parquet, Quarto

TL;DR - Tools and Tricks for the AI age

In jazz, a “bag of tricks” is the collection of techniques, styles, and improvisational methods that musicians pick up to enhance their performances. These tricks include scales, chord voicings, rhythmic patterns, and expressive techniques that help create unique and engaging music.

Here is my bag of tricks for … AI age.

  • One problem with tools is that they have a long tail, and after a while you may forget them.
  • Another problem is that tools become outdated as better ones come along. At what point should we switch?
Fire Scraper Web Scraping Python
A commercial tool for extracting articles from web pages.

Scrapy Web Scraping Python FOSS : An open-source, collaborative web crawling framework for Python.

word2vec Embeddings Python FOSS
A group of related models that are used to produce word embeddings.
FastText Embeddings Python FOSS
A library for efficient learning of word representations and sentence classification.
GloVe Embeddings Python FOSS
An unsupervised learning algorithm that derives word vectors from global word-word co-occurrence statistics.
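
All three embedding tools above learn from which words appear near each other. As a rough illustration of that raw signal, here is a minimal sketch in plain Python that counts (word, context) pairs inside a sliding window. The function name `cooccurrence_counts` and the toy sentence are my own assumptions for illustration; none of this is the API of word2vec, FastText, or GloVe.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count ordered (word, context) pairs that fall within `window` positions."""
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                counts[(word, tokens[j])] += 1
    return counts


# Toy sentence, made up for illustration.
tokens = "the cat sat on the mat".split()
counts = cooccurrence_counts(tokens, window=1)
```

Skip-gram style models train on pairs like these one at a time, while count-based models aggregate them into a matrix first. Either way, the window size is a design choice that trades syntactic for topical similarity.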

spaCy / NLP / Python / FOSS : An open-source library for natural language processing in Python.

Prodi.gy / NLP / Data Annotation / Python / Commercial : A modern annotation tool for creating training data for machine learning models.

PySpark Big Data Python FOSS
The Python API for Apache Spark, an open-source distributed computing engine that generalizes the MapReduce paradigm.
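
To make the map-reduce idea concrete, here is a minimal single-machine sketch of a word count split into map, shuffle, and reduce phases. It is plain Python, not PySpark code; the function names and the two input lines are assumptions made up for illustration. A real cluster would run the map and reduce phases in parallel across machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key into a single count."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


# Toy input, made up for illustration.
lines = ["spark makes big data simple", "big data needs big tools"]
mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(shuffle(mapped))
```
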
Data Heroes
DataHeroes uses coresets (compact geometric representations that preserve key properties of the full dataset) to stand in for the full training dataset.
Outlines LLM Structured Data
Outlines guarantees structured outputs during generation — directly from any LLM.
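
One common way to get structured output during generation is to mask, at every step, any token that would make the output invalid. The toy sketch below shows that idea with a fixed set of allowed answers. It is not the Outlines API or its implementation: `constrained_decode`, the vocabulary, and the score table standing in for a model are all hypothetical, and the sketch assumes no allowed answer is a prefix of another.

```python
def constrained_decode(score_token, vocabulary, allowed_outputs, max_steps=10):
    """Greedy decoding that only picks tokens keeping the text a valid prefix."""
    text = ""
    for _ in range(max_steps):
        if text in allowed_outputs:
            return text
        # Keep only tokens that extend the text toward some allowed output.
        valid = [
            tok for tok in vocabulary
            if any(out.startswith(text + tok) for out in allowed_outputs)
        ]
        if not valid:
            raise ValueError(f"no valid continuation after {text!r}")
        text += max(valid, key=lambda tok: score_token(text, tok))
    raise ValueError("max_steps reached before completing an allowed output")


# Hypothetical token scores standing in for a language model.
toy_scores = {"ana": 5.0, "ban": 3.0, "neg": 2.0, "pos": 1.0, "itive": 0.5, "ative": 0.4}
label = constrained_decode(
    score_token=lambda text, tok: toy_scores[tok],
    vocabulary=list(toy_scores),
    allowed_outputs={"positive", "negative"},
)
```

Unconstrained greedy decoding would start with the highest-scoring token, "ana". With the mask in place, only "pos" and "neg" are eligible at the first step, so the result is always one of the allowed labels.
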
Dynamic Linear Models Time Series R Bayesian
The R library for the eponymous Bayesian time series framework, built on the Kalman filter and the superposition of state space models.
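
The Kalman filter at the heart of this framework is easiest to see in the simplest dynamic linear model, the local level model: a level that drifts as a random walk, observed with noise. Below is a minimal Python sketch of that one component only. It is not the R package's interface, it does not show superposition of several components, and the function name, the diffuse default prior, and the toy readings are my assumptions.

```python
def local_level_filter(ys, obs_var, state_var, m0=0.0, c0=1e6):
    """Kalman filter for y_t = mu_t + v_t, mu_t = mu_{t-1} + w_t.

    obs_var is Var(v_t), state_var is Var(w_t); (m0, c0) is the prior on mu_0.
    Returns the filtered mean of the level after each observation.
    """
    m, c = m0, c0
    means = []
    for y in ys:
        a, r = m, c + state_var   # prior for the level at time t
        q = r + obs_var           # variance of the one-step forecast
        k = r / q                 # Kalman gain
        m = a + k * (y - a)       # posterior mean: correct by the forecast error
        c = r - k * r             # posterior variance
        means.append(m)
    return means


# Toy readings with one spike, made up for illustration.
readings = [1.0, 1.2, 0.9, 1.1, 5.0, 1.0]
filtered = local_level_filter(readings, obs_var=1.0, state_var=0.1)
```

The ratio of `state_var` to `obs_var` controls how quickly the filter follows the data: a small ratio smooths heavily, a large one tracks each observation closely.
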
Stumpy Time Series Python
STUMPY is a library for time series data mining, focusing on matrix profile algorithms.
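
A matrix profile records, for every window of a time series, the distance to its nearest neighbor among the other windows. Here is a brute-force sketch of that definition in plain Python, quadratic in the series length and only meant to show the idea. It is not STUMPY's API or its fast algorithm; the exclusion zone of `m // 2` and the toy series are assumptions for illustration.

```python
import math


def znorm(window):
    """Z-normalize a window so matches depend on shape, not offset or scale."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """For each length-m window, the distance to its nearest non-trivial match."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = max(1, m // 2)  # assumed zone that skips overlapping neighbors
    profile = []
    for i, wi in enumerate(windows):
        best = math.inf
        for j, wj in enumerate(windows):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(wi, wj)))
            best = min(best, dist)
        profile.append(best)
    return profile


# Toy series where the shape [0, 1, 0] occurs at index 0 and again at index 6.
series = [0, 1, 0, 5, 2, 8, 0, 1, 0, 3, 7, 1]
profile = naive_matrix_profile(series, m=3)
```

Low values in the profile point at repeated patterns (motifs), and high values point at windows unlike anything else in the series (discords).
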
Pydantic Data Validation Python
Pydantic is a data validation and settings management library using Python type annotations.
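
The core idea, letting type annotations drive validation, can be sketched with the standard library alone. The example below is not Pydantic and does far less than it; the `validate` helper and the `Experiment` class are hypothetical names made up for illustration. It reads the annotations of a dataclass, coerces values where possible, and raises on values that cannot be converted.

```python
from dataclasses import dataclass
from typing import get_type_hints


def validate(instance):
    """Coerce each field to its annotated type, or raise ValueError."""
    for name, expected in get_type_hints(type(instance)).items():
        value = getattr(instance, name)
        if isinstance(value, expected):
            continue
        try:
            setattr(instance, name, expected(value))
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"{name}: expected {expected.__name__}, got {value!r}"
            ) from exc


@dataclass
class Experiment:
    name: str
    epochs: int
    learning_rate: float

    def __post_init__(self):
        validate(self)  # run validation right after construction


# Strings, as they might arrive from a config file or CLI, are coerced.
exp = Experiment(name="baseline", epochs="10", learning_rate="0.01")
```

This sketch only handles plain types such as `str`, `int`, and `float`. Nested models, optional fields, and detailed error reports are where a real validation library earns its place.
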
Data Wrangler for VS Code
A code-centric data viewing and cleaning tool, a kind of smart ETL, integrated into VS Code and VS Code Jupyter Notebooks.
Parquet
Parquet is a columnar storage file format optimized for use with big data processing frameworks.
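
To see why a columnar layout helps analytics, compare the same records stored row by row and column by column. The sketch below uses plain Python structures only; it does not read or write Parquet files, and the records and the `to_columns` helper are made up for illustration.

```python
# Row-oriented: each record keeps all of its fields together.
rows = [
    {"city": "Haifa", "temp_c": 20.0, "humidity": 60},
    {"city": "Eilat", "temp_c": 30.0, "humidity": 20},
    {"city": "Tel Aviv", "temp_c": 25.0, "humidity": 70},
]


def to_columns(records):
    """Pivot a list of records into one list per field."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column-oriented: each field is stored contiguously.
columns = to_columns(rows)

# An aggregate over one field touches only that column.
mean_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
```

Because values of one type sit next to each other, a columnar file can also compress each column well and skip columns a query never asks for.
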
VS Code
VS Code is a lightweight but powerful source code editor which runs on your desktop and is available for Windows, macOS, and Linux.
Quarto Blogging Data Science FOSS
Quarto is a Markdown-based document authoring system, powered by Pandoc and designed for data science and technical writing.
Data Science at the Command Line
This is a collection of command-line tools and techniques for data science, enabling efficient data manipulation, analysis, and visualization directly from the terminal.
Deepchecks Model Evaluation Python FOSS
Deepchecks is a library for testing and validating machine learning models.

Citation

BibTeX citation:
@online{bochman2025,
  author = {Bochman, Oren},
  title = {AI a Bag of Tricks},
  date = {2025-10-04},
  url = {https://orenbochman.github.io/posts/2025/2025-10-04-Bag-Of-Tricks/},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2025. “AI a Bag of Tricks.” October 4, 2025. https://orenbochman.github.io/posts/2025/2025-10-04-Bag-Of-Tricks/.