Find the errors, get the better: Enhancing machine translation via word confidence estimation

  • NGOC-QUANG LUONG (a1), LAURENT BESACIER (a1) and BENJAMIN LECOUTEUX (a1)
Abstract

This paper presents two novel ways of improving Machine Translation (MT) quality by applying word-level quality prediction in a second pass of decoding. The word scores estimated by word confidence estimation systems are used to reconsider the MT hypotheses and select a better candidate, rather than accepting the current sub-optimal one. In the first approach, the selection scope is limited to the MT N-best list, where our proposed re-ranking features are combined with those of the decoder for re-scoring. In the second, the search space is enlarged to the entire search graph, which stores many more hypotheses generated during the first pass of decoding. Over all paths containing words of the N-best list, we propose an algorithm to strengthen or weaken them depending on the estimated word quality. In both methods, the highest-scoring candidate after the search becomes the official translation. The results show that both approaches improve MT quality over the one-pass baseline, and that search graph re-decoding achieves larger gains (in BLEU score) than N-best list re-ranking.
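The N-best re-ranking idea can be illustrated with a minimal sketch. The abstract does not specify the re-ranking features or how their weights are tuned, so everything below is an assumption made for illustration: the `Hypothesis` structure, the use of mean per-word confidence as the single extra feature, and the hand-set weights `w_decoder` and `w_wce` are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    tokens: List[str]
    decoder_score: float           # model score from the first decoding pass
    word_confidences: List[float]  # hypothetical WCE output: P(word is good) per token


def wce_feature(hyp: Hypothesis) -> float:
    """Mean per-word confidence (an assumed feature); 0.0 for an empty hypothesis."""
    if not hyp.word_confidences:
        return 0.0
    return sum(hyp.word_confidences) / len(hyp.word_confidences)


def rerank(nbest: List[Hypothesis],
           w_decoder: float = 1.0,
           w_wce: float = 1.0) -> List[Hypothesis]:
    """Sort hypotheses by a weighted sum of decoder score and WCE feature, best first."""
    def combined(h: Hypothesis) -> float:
        return w_decoder * h.decoder_score + w_wce * wce_feature(h)
    return sorted(nbest, key=combined, reverse=True)


# Toy N-best list: the decoder prefers the first hypothesis,
# but its last word receives a low confidence score.
nbest = [
    Hypothesis(["the", "house", "green"], -1.0, [0.9, 0.8, 0.1]),
    Hypothesis(["the", "green", "house"], -1.2, [0.9, 0.9, 0.9]),
]
best = rerank(nbest, w_decoder=1.0, w_wce=2.0)[0]
print(" ".join(best.tokens))  # the green house
```

With `w_wce` set to 0 the ranking falls back to the decoder's own order, which corresponds to the one-pass baseline in this toy setting. In practice the weights would be tuned on held-out data rather than set by hand.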


Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering