TraMOOC: Translation for Massive Open Online Courses
October 11th 2017 10:30 - 10:45
Massive open online courses (MOOCs) have been growing rapidly in size and impact. TraMOOC develops high-quality translation of all text genres found in MOOCs from English into eleven European and BRIC languages (DE, IT, PT, EL, DU, CS, BG, CR, PL, RU, ZH) that are hard to translate into and have weak MT support. Phrase-based and syntax-based SMT models, as well as NMT models, have been developed to address language diversity and to support the language-independent nature of the methodology. To achieve high-quality automatic translation and to add value to existing infrastructure, extensive and advanced bootstrapping of new resources is performed. An innovative multimodal automatic and human evaluation schema further ensures translation quality. For human evaluation, an innovative crowdsourcing setup with strict access control is used, which is both time- and cost-efficient. Translation experts, domain experts and end users are also involved. Separate task mining applications are employed for implicit translation evaluation: (i) topic detection is applied to source and translated texts and the resulting entity lists are compared, yielding further qualitative and quantitative translation evaluation results; (ii) sentiment analysis performed on MOOC users' blog posts reveals end users' opinions of translation quality. The results are combined into a feedback vector and used to refine the parallel data and retrain the translation models towards a more accurate second-phase translation output. The project results are showcased and tested on the MOOC platforms and on the VideoLectures.net digital video lecture library.
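The entity-list comparison in step (i) could be sketched roughly as follows. This is only an illustration, not the project's actual method: it assumes that topic detection returns language-independent entity identifiers for both the source and the translated text, and it uses Jaccard overlap as one possible comparison score. The function name `entity_overlap` and the identifiers `E1` to `E4` are hypothetical.

```python
def entity_overlap(source_entities, translated_entities):
    """Jaccard overlap between two collections of entity identifiers.

    Assumption: both lists hold language-independent identifiers, so an
    entity detected in the source text and in its translation compares equal.
    """
    src = set(source_entities)
    tgt = set(translated_entities)
    if not src and not tgt:
        # Neither text yielded entities; treat the two lists as identical.
        return 1.0
    return len(src & tgt) / len(src | tgt)


# Hypothetical entity lists for one source segment and its translation.
source = ["E1", "E2", "E3"]
translated = ["E1", "E2", "E4"]
score = entity_overlap(source, translated)  # 2 shared / 4 distinct entities
```

A low score would suggest that the translation lost or altered topical content, which is the kind of signal that might feed into the feedback vector described above.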