Interpretation, NCTA

INTERPRETING MACHINES (BESIDES US)

The first NCTA meeting of 2008 took place on February 9 and featured—in addition to our election results and news of ongoing projects—longtime NCTA member Hany Farag’s presentation on new developments in machine translation.

BY SARAH LLEWELLYN
Hany Farag at the Annual Meeting

NCTA Secretary Stafford Hemmer, standing in for the absent Vice President Yves Avérous, began the meeting with a series of announcements, including details of upcoming NCTA workshops, a call for volunteers to present future NCTA workshops and also to contribute to Translorial, and a reminder about the monthly happy hours that take place the last Monday of every month in San Francisco and Oakland.

Alison Dent announced the results of the recent (uncontested) election and welcomed each of the new board members, who will begin two-year terms effective immediately. Dagmar Dolatschko will take over from Song White as treasurer; Paula Dieli will take over Naomi Baer’s position as membership director; Norma Kaminsky will be responsible for continuing education in place of the outgoing Mateo Rutherford; and Diane Montgomery will take on the new role of director of marketing. Stafford Hemmer will continue in his capacity as secretary. Stafford thanked each of the departing members of the board for their valuable and often inspirational contributions during their tenure.

The Interpreter Machine

The meeting’s featured presentation was given by longtime NCTA member and former board member Hany Farag. Hany works at the intersection of language and technology: he is a translator and state-certified Arabic interpreter, as well as a technologist specializing in automation and control systems.

Hany’s presentation focused on recent efforts to develop an automated, real-time speech-to-speech translation device—an “interpreter machine”—under the auspices of DARPA, the U.S. government’s Defense Advanced Research Projects Agency. While machine translation in various guises has been around for some 50 years, the development of such a system was hastened by an urgent need for Arabic-language interpreters in Iraq after the 2003 U.S. invasion of that country.

Iraq: Facts and Challenges

One of the challenges facing the ground forces in Iraq was how to rebuild a nation of 20 million people while having virtually no knowledge of the native language, Arabic. The number of interpreters needed—more than 5,000, based on U.S. troop deployments—was an unrealistic target, particularly given that in the whole of California there were, at most, 500 Arabic-language interpreters. And using local interpreters posed a variety of problems, not least of which was the reliability of their information for intelligence purposes. In response, DARPA initiated a project entitled Global Autonomous Language Exploitation (GALE) to develop an interpreter machine that could communicate spontaneously in real time in tactical—that is, war or battle—situations.

Competing to Succeed

Three teams of researchers were hired to develop systems: IBM, the Stanford Research Institute (SRI), and Bolt Beranek & Newman (BBN). Each year their progress would be evaluated, and the worst-performing team could be eliminated—or the program could be shut down entirely. As many as 200 people at a time have been working around the clock on this initiative: the largest language project in existence.

Because the only existing related technology was machine translation for text, the interpreter machine had to be developed as a series of building blocks. The first was ASR (Automatic Speech Recognition). Machine translation was the second component, which involved creating a corpus, or body, of words in context to improve the translation. The third building block was text-to-speech synthesis (TTS), which was already of exceptionally good quality.
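The three building blocks chain together naturally: each one consumes the previous one’s output. The sketch below is purely illustrative of that pipeline structure; the function names and data types are hypothetical stand-ins, not the actual GALE, MASTOR, or IRAQCOMM components.

```python
# Illustrative sketch of the three-stage speech-to-speech pipeline described
# above (ASR -> machine translation -> TTS). All components are hypothetical
# placeholders, not the real DARPA/GALE systems.

from dataclasses import dataclass


@dataclass
class AudioClip:
    """Raw audio, e.g. a spoken Arabic or English utterance."""
    samples: bytes
    sample_rate: int = 16_000


def recognize_speech(clip: AudioClip) -> str:
    """Block 1: Automatic Speech Recognition - audio in, source-language text out."""
    # A real system would run acoustic and language models here.
    return "<recognized source-language text>"


def translate_text(source_text: str) -> str:
    """Block 2: Machine translation, trained on a corpus of words in context."""
    # A real system would apply a translation model built from that corpus.
    return "<translated target-language text>"


def synthesize_speech(target_text: str) -> AudioClip:
    """Block 3: Text-to-speech synthesis - target-language text back to audio."""
    # A real system would generate a waveform from the text.
    return AudioClip(samples=b"", sample_rate=16_000)


def interpret(clip: AudioClip) -> AudioClip:
    """Chain the three building blocks into one speech-to-speech pass."""
    source_text = recognize_speech(clip)
    target_text = translate_text(source_text)
    return synthesize_speech(target_text)


if __name__ == "__main__":
    spoken_input = AudioClip(samples=b"")  # placeholder utterance
    spoken_output = interpret(spoken_input)
    print(f"Output audio at {spoken_output.sample_rate} Hz")
```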

By late 2006, two machines were ready for deployment in Iraq: IBM’s MASTOR and SRI’s IRAQCOMM, each using a different technology and each with an estimated text accuracy of around 75%. R&D is still in progress, with the goal of reaching 95% accuracy—comparable to a human interpreter—by 2010.

Hany concluded his presentation by suggesting that no one can stop the progress of technology, and that we need to embrace innovation by understanding it and contributing to it if we can. Researchers, after all, are not practicing interpreters!

After a brief Q&A session, NCTA presented Hany with a box of Valentine’s Day Joseph Schmidt chocolates, to thank him for his presentation.