Learning Multilingual Open-vocabulary Paraphrasing

October 11th 2017 10:15 - 10:30
Room C

We will present a method for learning to paraphrase both within a single language and between languages. Our method uses word and n-gram vector space representations that are learned from large raw text corpora and combined into representations of longer phrases. The key advantage of the approach is that it can process a search phrase outside of its vocabulary and come up with paraphrase suggestions that have likewise not been seen in the training texts. In addition, there are no out-of-vocabulary tokens, because byte-pair encoding is used. We will demonstrate two applications of the method: generating suggestions for alternative domain names and evaluating translation hypotheses.
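
To illustrate why composing a phrase representation from subword units leaves no phrase without a vector, here is a minimal sketch. It is not the method presented in the talk. It makes several assumptions: character n-grams stand in for byte-pair-encoded units, deterministic pseudo-random vectors stand in for embeddings learned from a corpus, and averaging stands in for whatever composition the authors use. All function names and the dimensionality are hypothetical. Because the vectors are random, similarity here reflects only surface overlap, not meaning.

```python
import hashlib
import math
import random

DIM = 64  # illustrative dimensionality (assumption, not from the talk)


def ngrams(text, n_min=2, n_max=4):
    """Return the character n-grams of a lowercased phrase padded with < and >."""
    padded = f"<{text.lower()}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


def ngram_vector(gram):
    """Deterministic pseudo-random vector; a stand-in for a learned embedding."""
    digest = hashlib.sha256(gram.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]


def phrase_vector(text):
    """Average the n-gram vectors, so any string gets a representation."""
    grams = ngrams(text)
    total = [0.0] * DIM
    for gram in grams:
        for i, value in enumerate(ngram_vector(gram)):
            total[i] += value
    return [value / len(grams) for value in total]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query, candidates):
    """Sort candidate phrases by similarity to the query, most similar first."""
    query_vec = phrase_vector(query)
    return sorted(candidates,
                  key=lambda c: cosine(query_vec, phrase_vector(c)),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical domain-name style query and candidates.
    print(rank_candidates("cheapflights", ["zebra", "cheapflight"]))
```

With learned rather than random vectors, the same ranking step could in principle surface candidates that are related in meaning and not just in spelling, which is the setting the domain-name application describes.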

Mark Fishel

Head of the Natural Language Processing Research Group at the Institute of Computer Science, University of Tartu: https://www.ut.ee/en

The group conducts research and collaborates with industrial partners; its strong points are machine translation and post-editing, information extract...

