Validating NLP data and models with deepchecks

NLP.IL

A recap of the NLP.IL talk by Nir Hutnik on validating NLP data and models, discussing common issues such as data and prediction drift, sample outliers, and error analysis, and demonstrating how to detect these issues using the deepchecks open source testing package.
nlp
nlp.il
data validation
model validation
tools
Author

Oren Bochman

Published

Tuesday, February 28, 2023

Modified

Monday, May 18, 2026

Keywords

NLP, data validation, model validation, deepchecks

Nir Hutnik - LinkedIn

Abstract:

NLP data, like unstructured data in general, is hard to validate: operations such as statistical analysis and segmentation, which are straightforward on structured data, are not easy to carry out on text. In this talk, we will look at common issues in NLP data and models, such as data and prediction drift, sample outliers, and error analysis, discuss how they can impact model performance, and show how to detect these issues using the deepchecks open source testing package.
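To make the idea of data drift on text more concrete, here is a minimal sketch in plain Python. It is not the deepchecks API and not necessarily how the talk or the package measures drift. It assumes one simple text property (word count) and compares its distribution between a reference sample and a newer sample using a hand-rolled two-sample Kolmogorov-Smirnov statistic. The names `word_count`, `ks_statistic`, and `property_drift`, and the example texts, are all hypothetical.

```python
from bisect import bisect_right


def word_count(text: str) -> int:
    """A simple text property: the number of whitespace-separated tokens."""
    return len(text.split())


def ks_statistic(reference: list, current: list) -> float:
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest gap between the empirical CDFs of the two samples:
    0.0 means the distributions look identical, 1.0 means they do not overlap.
    """
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for value in set(ref_sorted) | set(cur_sorted):
        ref_cdf = bisect_right(ref_sorted, value) / len(ref_sorted)
        cur_cdf = bisect_right(cur_sorted, value) / len(cur_sorted)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def property_drift(reference_texts, current_texts, prop=word_count) -> float:
    """Drift score for one text property (hypothetical helper, an assumption)."""
    return ks_statistic(
        [prop(t) for t in reference_texts],
        [prop(t) for t in current_texts],
    )


# Made-up example data: short training reviews versus longer production reviews.
train_texts = [
    "great hotel and friendly staff",
    "room was clean",
    "breakfast was tasty and fresh",
    "nice view",
]
prod_texts = [
    "the room was not ready when we arrived and nobody at the desk apologized",
    "we waited almost an hour for the shuttle and then it was completely full",
    "the staff were kind but the air conditioning was broken for the whole stay",
    "breakfast ended earlier than advertised so we missed it on both mornings",
]

drift_score = property_drift(train_texts, prod_texts)
print(f"word-count drift score: {drift_score:.2f}")
```

A score near zero suggests the property is distributed similarly in both samples, while a score near one suggests the new texts look quite different from the reference data. In practice one would track several properties and pick an alert threshold suited to the data, which this sketch leaves out.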

Speaker

Nir Hutnik

Slides

[70 slide images from the talk]

Citation

BibTeX citation:
@online{bochman2023,
  author = {Bochman, Oren},
  title = {Validating {NLP} Data and Models with Deepchecks},
  date = {2023-02-28},
  url = {https://orenbochman.github.io/posts/2023/02-28-nlp-il-booking-meetup/NLP-IL-Booking Validating NLP.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2023. “Validating NLP Data and Models with Deepchecks.” February 28. https://orenbochman.github.io/posts/2023/02-28-nlp-il-booking-meetup/NLP-IL-Booking Validating NLP.html.