The Human Language Project

It’s time for a big idea. It’s time for the Human Language Project. An investment of a hundred million euros in the Human Language Project over the next five years will boost the language services sector into an industry worth more than a hundred billion euros, create the conditions for a truly open market in the European Union, and spur growth in international trade and e-commerce.

Machine Translation (MT) has come a long way in helping people with simple day-to-day communication across language barriers. The Human Language Project will establish MT as the new lingua franca, enabling effective and adequate communication across the eighty languages spoken in Europe in the first instance and, in the next phase, across the thousand or more languages spoken in the world.

The Human Language Project is an open platform of language resources and tools, consisting of at least the following:

Fearless sharing of language and translation data (speech and text) in all languages and language pairs, not hindered by outdated copyright law. European legislators must modernize copyright regulations on translation data. (See the TAUS article published in January 2013.)

A library of translation, language and reordering models covering all languages and a wide scope of domains to help fast-track and fine-tune the development and customization of machine translation engines.

A translation quality evaluation platform to help assess, benchmark and predict the right translation quality for different content types and different purposes of communication.

A library of language tools, such as parsers, chunkers, lemmatizers and taggers, to help service and technology providers improve and customize their solutions.

Common translation web services APIs to ensure that all services and technologies work seamlessly together.

The Human Language Project is the right fit for the European Commission’s Horizon 2020 and Connecting Europe Facility (CEF) funding programs. It will bring together public and private interests and funding.


The Human Language Project is inspired by the Human Genome Project, a cross-continental collaborative effort between business, government and academia aimed at mapping human DNA. The Human Genome Project started in 1990 and delivered its results 13 years later. According to analysts, the project has already generated 200 times more in revenue than it cost, not to mention its contribution to human progress. Similarly, we envision that the Human Language Project will have an exponentially positive effect on the language services sector and on trade and business in general, while raising human civilization to a much higher level of understanding, education and discovery.

Ideas around the Human Language Project have already been circulated and discussed by Steven Abney and Steven Bird, two linguists who want to see “a corpus that will include all of the world’s languages, in a consistent structure that permits large-scale cross-linguistic processing, enabling the study of universal linguistics.” More practically, ideas for the Human Language Project were brought forward at the TAUS conferences in Paris (spring 2012) and Seattle (autumn 2012). (See the video section on the TAUS website.) Building blocks for the Human Language Project already exist in the form of various initiatives and actions started by TAUS, its members, and several other organizations and institutions.

Tell others and let us know

With this pitch, TAUS wants to stimulate discussion and lobby for collaboration across business and research organizations to make the most of the new European calls: FP 8, Horizon 2020 and the Connecting Europe Facility. Please circulate this “Big Idea” and leave your comments at the bottom of this article.