Training statistical machine translation systems on heterogeneous corpora can result in suboptimal performance. Cuòng Hoang shows that domain-confused statistics may harm the performance of both word alignment models and phrase-based models.
C. Hoàng: Latent Domain Models for Statistical Machine Translation.
Prof. K. Sima'an
Dr I.A. Titov
This event is open to the public.