Translation Models

CMU CS11-737: Multilingual NLP

This week covers MT and seq2seq models, including language models that assign probabilities to text and conditional language models that generate text according to some specification. We look at how to compute the probability of a sentence with autoregressive models, implemented with recurrent neural networks, and how to generate sentences using methods such as sampling and argmax decoding. We also introduce attention mechanisms and the Transformer model, which uses self-attention together with other techniques such as positional encodings, layer normalization, and specialized training schedules. Minimal code sketches of these ideas follow the post metadata below.
Categories: Attention, Multilingual NLP, NLP, Notes, seq2seq, NMT

Author: Oren Bochman

Published: Thursday, February 3, 2022
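
As a warm-up for the autoregressive view described above, here is a minimal, self-contained Python sketch. A fixed bigram table stands in for the RNN the notes describe, and the vocabulary and probabilities are hypothetical, chosen only to show how the chain rule yields a sentence probability and how argmax (greedy) decoding differs from sampling.

```python
import numpy as np

# Toy autoregressive "language model": P(x_t | x_{<t}) is read off a fixed
# bigram table, standing in for the RNN described in the notes. The
# vocabulary and probabilities are hypothetical, for illustration only.
vocab = ["<s>", "the", "cat", "sat", "</s>"]
idx = {w: i for i, w in enumerate(vocab)}

# bigram[i, j] = P(vocab[j] | vocab[i]); each row sums to 1.
bigram = np.array([
    [0.0, 0.7, 0.2, 0.0, 0.1],   # after <s>
    [0.0, 0.0, 0.6, 0.3, 0.1],   # after "the"
    [0.0, 0.1, 0.0, 0.7, 0.2],   # after "cat"
    [0.0, 0.2, 0.1, 0.0, 0.7],   # after "sat"
    [0.0, 0.0, 0.0, 0.0, 1.0],   # after </s> (absorbing)
])

def sentence_log_prob(words):
    """Chain rule: log P(x) = sum_t log P(x_t | x_{t-1})."""
    tokens = ["<s>"] + words + ["</s>"]
    return sum(np.log(bigram[idx[a], idx[b]])
               for a, b in zip(tokens, tokens[1:]))

def generate(mode="argmax", max_len=10, seed=0):
    """Greedy (argmax) decoding vs. ancestral sampling."""
    rng = np.random.default_rng(seed)
    out, cur = [], idx["<s>"]
    for _ in range(max_len):
        p = bigram[cur]
        cur = int(p.argmax()) if mode == "argmax" else int(rng.choice(len(vocab), p=p))
        if vocab[cur] == "</s>":
            break
        out.append(vocab[cur])
    return out

print(sentence_log_prob(["the", "cat", "sat"]))  # sum of bigram log-probs
print(generate("argmax"))                        # deterministic: the most likely path
print(generate("sample"))                        # stochastic: varies with the seed
```

Swapping a learned RNN (or Transformer decoder) in for the bigram table changes how `P(x_t | x_{<t})` is computed, but the probability calculation and both decoding loops stay exactly the same.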
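The attention mechanism at the core of the Transformer is also short enough to write out directly. Below is a sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k)V; the shapes and random inputs are hypothetical, standing in for decoder queries attending over encoder states.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax
    return weights @ V, weights

# Hypothetical toy sizes: 3 decoder queries attending over 4 encoder states.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
context, weights = scaled_dot_product_attention(Q, K, V)
print(weights.sum(axis=-1))  # each query's weights sum to 1
print(context.shape)         # (3, 8): one context vector per query
```

Self-attention in the Transformer is the special case where Q, K, and V all come from the same sequence; positional encodings, layer normalization, and the training schedule are treated separately in the notes.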