Human-machine interface: Artificial Intelligence applied to translation

Training translation AI systems with proprietary terminology allows for machine translation optimization




We have talked before about how machine translation still requires human post-editing to ensure the quality and consistency of results. Language is subjective, and large amounts of bilingual data are needed to train translation applications and IT systems.

Machine translation struggles to produce good results in domain-specific translations. That is, general translations are possible, but the results aren’t satisfactory for technical content.

Nowadays there is growing innovation in artificial intelligence systems applied to translation, aimed at greater automation with better results.

Artificial Intelligence applied to translation

Current development focuses on building personalized automated translation engines that apply machine-learning technology to proprietary databases of terminology and expressions. This helps the machine learn to improve the accuracy of technical and specialized translations.

In this way, automated translation evolves into domain-specific translation; that is, translation of content related to a specific area of knowledge. The idea is for the machine to learn the vocabulary of a particular area, perform analysis and term recognition, and manage proprietary or external terminology bases to obtain more accurate translations of content within specialty areas.

Custom MT

In the past two years, companies like Microsoft, Google or IBM have launched a new generation of neural machine translation engines with custom MT (custom machine translation).

These new systems, such as Microsoft Custom Translator, Google AutoML Translation or IBM Watson, have been designed to produce translations customized for use within a specific domain (with terminology bases).

This automated custom translation uses proprietary corporate content, or previous translations in a specific domain, to train artificial intelligence platforms to:

  • Enforce the use of a glossary: The translation of certain names or word groups is decided in advance, and a glossary or vocabulary list is created. When translating, the system identifies those words, consults the glossary and uses the approved translation for them.
  • Customize a terminology base: To build a larger learning base, phrases or text fragments from industry-specific documents can be added to the system. In this way the system can be trained to optimize for and adapt to a specific domain.
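The glossary step described above can be pictured with a small Python sketch. It is not how any of the engines named here work internally; it only illustrates the idea of identifying glossary words in the source and checking that the approved translation appears in the output. The glossary entries, the sample sentences and the function names are all hypothetical.

```python
import re

# Hypothetical English -> Spanish glossary; entries are illustrative only.
GLOSSARY = {
    "torque wrench": "llave dinamométrica",
    "gearbox": "caja de cambios",
}

def find_glossary_terms(source_text, glossary):
    """Return (term, approved_translation) pairs whose term occurs in the source."""
    found = []
    for term, approved in glossary.items():
        # Whole-word, case-insensitive match of the glossary term.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, source_text, flags=re.IGNORECASE):
            found.append((term, approved))
    return found

def missing_approved_terms(source_text, machine_output, glossary):
    """List glossary terms whose approved translation is absent from the output."""
    output_lower = machine_output.lower()
    return [
        (term, approved)
        for term, approved in find_glossary_terms(source_text, glossary)
        if approved.lower() not in output_lower
    ]

source = "Tighten the gearbox bolts with a torque wrench."
output = "Apriete los pernos de la caja de cambios con una llave de torsión."
print(missing_approved_terms(source, output, GLOSSARY))
# [('torque wrench', 'llave dinamométrica')]
```

A check like this flags segments where the approved term was not used, which is the kind of issue a post-editor would otherwise have to spot by hand.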

It is true that results from these custom translation engines still need human post-editing and a quality check. However, editing time and the editor's workload are significantly reduced.

Training these systems requires a large amount of information and bilingual data. Most of the available data consists of languages paired with English. Hence the need to build databases for other language pairs in order to boost this technology.

Therefore, translations must be considered corporate data, and be protected and structured as assets. Although custom machine translation is now technologically possible, in practice it is very complex for a company to train its own proprietary system, due to the amount of data required and the cost of structuring that data for technological use.
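As a rough illustration of what structuring translations as assets could look like, the Python sketch below stores each translated sentence as a record and exports unique source/target pairs as tab-separated text. The field names, the sample sentences and the tab-separated layout are assumptions made for this example, not a format required by any particular engine.

```python
import csv
import io
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    """One translated sentence pair; the field names are hypothetical."""
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str
    domain: str

def to_tsv(segments):
    """Serialize unique segments as tab-separated source/target lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    seen = set()
    for seg in segments:
        if seg in seen:  # frozen dataclasses are hashable, so repeats are skipped
            continue
        seen.add(seg)
        writer.writerow([seg.source_text, seg.target_text])
    return buffer.getvalue()

segments = [
    Segment("en", "es", "Check the gearbox.", "Revise la caja de cambios.", "automotive"),
    Segment("en", "es", "Check the gearbox.", "Revise la caja de cambios.", "automotive"),
    Segment("en", "es", "Replace the filter.", "Sustituya el filtro.", "automotive"),
]
print(to_tsv(segments))
```

Keeping the language pair and domain on every record makes it possible to select, later on, only the segments relevant to a given engine or specialty area.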

Gear Translations facilitates the path towards better domain-specific translations by identifying and organizing your key terms and phrases in a Terminology Base, and by creating multilingual full-phrase databases in your Translation Library. We continue to move forward together!

Thanks to Artificial Intelligence applied to translation, you can now optimize and accelerate your translation projects.