Guidance on the Nuts and Bolts of Self-service MT

Author: Unknown. Link to original: (English).
Tags: НУА Submitted by lyovkin_ilya 07.09.2016. Public material.


We have been publishing articles, videos and reports on self-service MT implementation for some time. Our reports have covered managers' decision-making processes, technical implementation, data selection and cleaning, and quality evaluation, as well as the outcomes of research into usage and requirements.

Videos provide valuable information on lessons learnt and best practices among early adopters.

Articles have covered a whole range of topics, from overviews of the landscape and collaborative initiatives to address common issues, to summaries of our own proofs of concept for the viability of ‘push button’ MT customization, among many other things.

However, many questions remain open on how to make improvements when the standard downloadable toolkits, Moses in particular, do not fully meet your needs. Our research on usage and requirements shows that many smaller firms are stumbling over the basics, such as installation, data preparation and integration. We are all very fortunate that TAUS members have already stepped up to these challenges and are willing to share freely the solutions they have developed.

Clearly, users want simple, fast and, ideally, free out-of-the-box solutions. Those already familiar with the landscape will be busy trying out some of the recent add-ons to standard toolkits. These aim to improve the usability and productivity of MT systems and they include:

Making the translation process faster and less memory-intensive; for example, by using MGIZA++, a multithreaded implementation of the GIZA++ word aligner

Increasing the correlation between automated and human evaluation of translation results; for example, by using Hjerson, a tool for automatic error classification

Helping users to assess the associated post-editing effort; for example, by using EvalTrans, a framework for human-assisted evaluation of MT quality, or Addicter, an MT error visualizer and labeler

Using alternatives to GIZA++ for word alignment, and translation decoders other than Moses, to discover whether a standard workflow is the most efficient way to perform MT for a specific language pair and task; for example, by using BIA or Berkeley Aligner for alignment, or Ncode or Joshua for decoding
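
To make the idea of assessing post-editing effort a little more concrete, here is a minimal sketch of one common rough proxy: the word-level edit distance between raw MT output and its post-edited version, divided by the length of the post-edited text. This is not how EvalTrans or Addicter work internally; the function names word_edit_distance and post_edit_rate are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Levenshtein distance over word tokens.

    Insertions, deletions and substitutions each cost 1.
    """
    # prev[j] holds the distance between the hypothesis prefix processed
    # so far and the first j reference tokens.
    prev = list(range(len(ref_tokens) + 1))
    for i, hyp_word in enumerate(hyp_tokens, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref_tokens, start=1):
            substitution = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,                  # delete a hypothesis word
                curr[j - 1] + 1,              # insert a reference word
                prev[j - 1] + substitution,   # keep or substitute
            ))
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output, post_edited):
    """Edits per post-edited word; lower suggests less post-editing effort.

    Assumes simple whitespace tokenization (an illustrative choice only).
    """
    hyp = mt_output.split()
    ref = post_edited.split()
    if not ref:
        raise ValueError("post-edited text must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    mt = "the cat sat in mat"
    edited = "the cat sat on the mat"
    # One substitution (in -> on) plus one insertion (the) over six words.
    print(round(post_edit_rate(mt, edited), 3))
```

A score like this only counts changed words; it says nothing about how long the changes took or which kinds of error caused them, which is the gap that error classification and visualization tools aim to fill.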

For many others who are only now dipping their toes in, it is difficult to know where to begin when looking to optimize their translation automation. With this in mind, we launched the TAUS Tracker site (in beta), a series of free directories of translation and language technologies. We encourage you to take a look, share this resource, and let others know about your experience with the listed tools by leaving comments.

If you are in doubt about the value of trying out these and other alternatives to the current standard toolkits, please download this short TAUS Labs briefing note, which summarizes the potential benefits of three of the tools mentioned above.