Portuguese-Chinese Neural Machine Translation
Rodrigo Santos is an MA student of Informatics Engineering at the University of Lisbon, where he is carrying out his master's work at the NLX-Natural Language and Speech Group under the supervision of António Branco and João Silva. He is presenting joint work with his supervisors and Deyi Xiong, from Tianjin University.
Machine Translation (MT) has been one of the classic AI tasks since the early days of the field. Portuguese and Chinese each have a very large number of native speakers, though this is not matched by the amount of literature on their processing or by the amount of resources available, in particular when compared with English. In this paper, we address the feasibility of creating an MT system for Portuguese-Chinese, using only freely available resources, by experimenting with various approaches to pairing source and target parallel data during training. These approaches are (i) using a model for each source-target language pair, (ii) using an intermediate pivot language, and (iii) using a single model that can translate from any language seen on the source side to any language seen on the target side. We find approaches that outperform the strong baseline consisting of an MT service provided by an IT industry giant (Google) for the pair Portuguese-Chinese.
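
The three approaches listed above differ mainly in how translation calls are routed, which can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the use of English ("en") as the pivot language, and the "<2zh>"-style target-language tag prepended to the input of the single many-to-many model. A "translator" here is just any function from string to string, standing in for a trained model.

```python
from typing import Callable, Dict, Tuple

# A translator is any callable mapping source text to target text
# (a stand-in for a trained NMT model; hypothetical interface).
Translator = Callable[[str], str]
ModelTable = Dict[Tuple[str, str], Translator]


def translate_direct(models: ModelTable, src: str, tgt: str, text: str) -> str:
    """Approach (i): one dedicated model per source-target language pair."""
    return models[(src, tgt)](text)


def translate_pivot(models: ModelTable, src: str, tgt: str, text: str,
                    via: str = "en") -> str:
    """Approach (ii): source -> pivot, then pivot -> target.

    The pivot language defaults to English here purely as an assumption.
    """
    intermediate = models[(src, via)](text)
    return models[(via, tgt)](intermediate)


def tag_for_target(text: str, tgt: str) -> str:
    """Prefix the input with a token naming the desired target language.

    The "<2xx>" token format is an illustrative assumption.
    """
    return f"<2{tgt}> {text}"


def translate_many_to_many(model: Translator, tgt: str, text: str) -> str:
    """Approach (iii): a single shared model, steered by a target-language tag."""
    return model(tag_for_target(text, tgt))
```

With stub translators in place of real models, `translate_pivot(models, "pt", "zh", text)` simply chains the Portuguese-English and English-Chinese entries of the table, while `translate_many_to_many` sends every request to the same model and relies on the tag to select the output language.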