While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise. As a result, it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact NMT training. In this talk, we will first discuss how such fine-grained semantic divergences can be detected without supervision. Then, we will analyze their impact on Transformer models, and, finally, we will discuss how they can be integrated into Transformers as factors.
Eleftheria Briakou is a fourth-year Ph.D. student in the Department of Computer Science at the University of Maryland, College Park. Her work focuses on detecting differences in meaning across languages and explores how they question common assumptions related to using data when developing NLP technology.