| Workshop 2005 | Wednesday, May 16, 2012 |
Statistical Machine Translation by Parsing
Machine translation (MT) is more important than ever. The quality of MT output has increased substantially in recent years, owing to more sophisticated use of statistical learning methods and objective evaluation methods. However, statistical MT (SMT) systems often generate "word salad," where the output may contain many correct words but in the wrong order, making it hard to understand. We propose to investigate a new approach to SMT that, in contrast to other syntax-based approaches, has models of word order at its core. Models that integrate word order more directly promise to greatly improve the readability of translations. One flavor of this approach has already demonstrated BLEU scores 50-100% higher than those obtained with standard SMT software. At the same time, preliminary experiments show that the efficiency of our approach rivals that of state-of-the-art SMT systems. Our research will focus simultaneously on two language pairs, English/French and English/Arabic, thus demonstrating the generality of the approach. In addition to improved MT, the goals of the workshop include training students to contribute to MT and NLP research for years to come, and producing a complete, easy-to-use reference implementation for worldwide distribution.
The Center for Language and Speech Processing, The Johns Hopkins University, 3400 North Charles Street, Barton Hall, Baltimore, MD 21218
Telephone: (410) 516-4237 | Fax: (410) 516-5050 | E-mail: clsp@clsp.jhu.edu