Restoring Hebrew Diacritics Without a Dictionary

Video 1: Presentation Video

Abstract

We demonstrate that it is feasible to accurately diacritize Hebrew script without any human-curated resources other than plain diacritized text. We present Nakdimon, a two-layer character-level LSTM, that performs on par with much more complicated curation-dependent systems, across a diverse array of modern Hebrew sources. The model is accompanied by a training set and a test set, collected from diverse sources. (Gershuni and Pinter 2022)
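To make the "plain diacritized text is the only resource" idea concrete, here is a minimal sketch of how diacritized text could be turned into character-level training pairs: the bare letters become the input, and the marks attached to each letter become its label. This is not the authors' preprocessing code. The function names `split_diacritics` and `restore_diacritics` are hypothetical, and treating all marks on a letter as one label string is a simplifying assumption.

```python
import unicodedata
from typing import List, Tuple


def is_hebrew_mark(ch: str) -> bool:
    """True for Hebrew combining marks (points and cantillation).

    Assumption: a mark is any nonspacing mark (category Mn) in the
    range U+0591..U+05C7 of the Unicode Hebrew block.
    """
    return "\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn"


def split_diacritics(text: str) -> Tuple[str, List[str]]:
    """Split text into bare characters and one mark string per character."""
    # NFD puts combining marks into canonical order after their base letter.
    text = unicodedata.normalize("NFD", text)
    bare: List[str] = []
    marks: List[str] = []
    for ch in text:
        if is_hebrew_mark(ch):
            if marks:
                marks[-1] += ch  # attach the mark to the preceding character
            # a mark with no preceding character is dropped
        else:
            bare.append(ch)
            marks.append("")  # no marks seen yet for this character
    return "".join(bare), marks


def restore_diacritics(bare: str, marks: List[str]) -> str:
    """Inverse of split_diacritics: re-attach each label to its character."""
    return "".join(ch + m for ch, m in zip(bare, marks))


if __name__ == "__main__":
    # The word "shalom" with vowel points, written with escapes:
    # shin + qamats + shin dot, lamed, vav + holam, final mem.
    word = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"
    bare, marks = split_diacritics(word)
    print(bare)
    print([[f"U+{ord(c):04X}" for c in m] for m in marks])
```

A character-level model such as the one described in the abstract would then read `bare` and predict one label per character; `restore_diacritics` shows how predicted labels map back to diacritized text.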

Outline

The Paper

paper

References

Gershuni, Elazar, and Yuval Pinter. 2022. “Restoring Hebrew Diacritics Without a Dictionary.” In Findings of the Association for Computational Linguistics: NAACL 2022, 1010–18.

Citation

BibTeX citation:

@online{bochman2021,
  author = {Bochman, Oren},
  title = {Restoring {Hebrew} {Diacritics} {Without} a {Dictionary}},
  date = {2021-05-13},
  url = {https://orenbochman.github.io/notes-nlp/reviews/paper/2022-nakdimon/},
  langid = {en}
}

For attribution, please cite this work as:

Bochman, Oren. 2021. “Restoring Hebrew Diacritics Without a Dictionary.” May 13, 2021. https://orenbochman.github.io/notes-nlp/reviews/paper/2022-nakdimon/.