Will TM Survive MT?

Posted by: Udi Hershkovich, Chief Executive Officer

Translation Memories (TM) are widely used today to aid human translators in their work. As a translation tool, they provide access to archives of previously translated text, reducing overall translation effort, time and cost while bringing greater consistency to corporate language translation.

It is a lesser-known fact that Machine Translation (MT) can do the same thing, with the additional benefit of NOT being restricted to archived matches only.

The process is called TM overlay. It allows MT to replicate the original TM (not only learn from it) alongside the statistical model in a high-performance in-memory database. We will call it Machine Translation Memory (MTM). The MTM identifies exact matches from the archive in real time. Any segment not found in the archive is then translated using statistical MT.
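To make the lookup order concrete, here is a minimal sketch of that flow, assuming the archive behaves like a plain in-memory dictionary keyed by source segment. The names `translate_segment` and `toy_mt` are hypothetical, and `toy_mt` merely stands in for a real statistical engine.

```python
from typing import Callable, Dict, Tuple


def translate_segment(
    source: str,
    archive: Dict[str, str],
    statistical_mt: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (translation, origin); origin is 'mtm_exact' or 'statistical_mt'."""
    hit = archive.get(source)
    if hit is not None:
        # Exact match found in the in-memory archive.
        return hit, "mtm_exact"
    # No archived match: fall back to the statistical engine.
    return statistical_mt(source), "statistical_mt"


def toy_mt(source: str) -> str:
    # Hypothetical placeholder for a real statistical MT engine.
    return f"<MT:{source}>"


archive = {"Press the power button.": "Appuyez sur le bouton d'alimentation."}
print(translate_segment("Press the power button.", archive, toy_mt))
print(translate_segment("Hold the reset key.", archive, toy_mt))
```

Returning the origin with each translation is one possible design choice; it lets downstream tools tell archive matches from statistical output.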

How does this change the translation process?

Today’s process separates archive matching from statistical MT:

[Image: TMMT]

The new process consolidates automated translation into a single step:

[Image: MTTM]

1) It eliminates the use of approximate string matches (fuzzy matches) and replaces them with far more accurate statistical correlations, speeding up translation turnaround and improving translation quality.

2) The MTM can be updated in real time with each segment committed by a translator, reviewer or editor. Every update to the MTM allows it to produce exact matches for recently committed segments, even within the same project. This helps increase consistency and eliminates the need to address repetitions separately.

3) MT becomes aware of every update to the MTM (unlike a separate TM) and can trigger an automated retraining of the statistical models, improving overall translation quality.

4) Finally, consolidating the multi-stage automated translation process reduces cycles and eliminates integration points, thus improving overall translation process efficiency.
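Points 2 and 3 can be sketched roughly as follows. This is an illustration only: the class `MachineTranslationMemory`, its `retrain_every` threshold and the `retrain` callback are all hypothetical names, and a real system would decide when to retrain in its own way.

```python
from typing import Callable, Dict, Optional


class MachineTranslationMemory:
    """Toy in-memory MTM that notifies a retraining hook after commits."""

    def __init__(
        self,
        retrain: Callable[[Dict[str, str]], None],
        retrain_every: int = 2,  # assumed threshold, purely illustrative
    ) -> None:
        self.archive: Dict[str, str] = {}
        self._retrain = retrain
        self._retrain_every = retrain_every
        self._pending = 0

    def commit(self, source: str, target: str) -> None:
        # The committed segment is available as an exact match immediately.
        self.archive[source] = target
        self._pending += 1
        if self._pending >= self._retrain_every:
            # Hand a snapshot of the archive to the retraining hook.
            self._retrain(dict(self.archive))
            self._pending = 0

    def lookup(self, source: str) -> Optional[str]:
        return self.archive.get(source)


retrain_calls = []
mtm = MachineTranslationMemory(retrain=lambda snapshot: retrain_calls.append(len(snapshot)))
mtm.commit("Save", "Enregistrer")
mtm.commit("Cancel", "Annuler")
print(mtm.lookup("Save"), retrain_calls)
```

Because the MT side owns the memory, the same `commit` call both serves later exact matches and counts toward the next retraining run.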

[Image: TM_MT]

Is a separate TM still required?

The answer is process dependent. If you decide to update the MTM on the fly, meaning translator-driven updates, you may want to keep a ‘clean’ TM on the side (it can still sit under the MT umbrella) to ensure that only edited and reviewed segments end up in the TM.

However, by tagging each segment in the MTM as either ‘dynamically updated’ or ‘approved’, the MTM can serve as the primary TM: for each new project, only ‘approved’ segments are replicated into the in-memory database. CAT tools can still access the MTM through an API to extract translation matches.
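As a rough sketch of that tagging scheme, assume each stored segment carries a status field. The `Status` enum, the `Segment` record and `load_for_new_project` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class Status(Enum):
    DYNAMIC = "dynamically updated"
    APPROVED = "approved"


@dataclass(frozen=True)
class Segment:
    source: str
    target: str
    status: Status


def load_for_new_project(segments: Iterable[Segment]) -> Dict[str, str]:
    """Replicate only approved segments into a new project's in-memory archive."""
    return {s.source: s.target for s in segments if s.status is Status.APPROVED}


stored = [
    Segment("Save", "Enregistrer", Status.APPROVED),
    Segment("Cancel", "Annuler", Status.DYNAMIC),
]
project_memory = load_for_new_project(stored)
print(project_memory)
```

Segments tagged as dynamically updated stay in storage and can be promoted to approved after review, without a second TM to keep in sync.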

What about TM pricing models?

Segment-level usability estimates produced by MT allow existing TM pricing models to stay in place, and can even improve on them. A usability score attached to each segment identifies MTM-matched segments. MTM matches are tagged as “MTM exact match” or “Perfect”, while statistical ‘matches’ are tagged as “Very Good”, “Acceptable” or “Poor”, as appropriate.

This helps translators assess the level of effort required of them, make intelligent decisions on whether to accept, edit or reject each segment, and keep effort aligned with cost.
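A small sketch of how such labels might be assigned follows. The numeric thresholds (0.8 and 0.5) and the 0-to-1 score range are assumptions made for the example, not values from any particular MT system, and `usability_label` is a hypothetical function name.

```python
from typing import Optional


def usability_label(is_mtm_match: bool, score: Optional[float] = None) -> str:
    """Map a segment to one of the usability tags described above."""
    if is_mtm_match:
        # Archive hits are not scored; they are exact by definition.
        return "MTM exact match"
    if score is None:
        raise ValueError("statistical segments need a usability score")
    if score >= 0.8:  # assumed cut-off
        return "Very Good"
    if score >= 0.5:  # assumed cut-off
        return "Acceptable"
    return "Poor"


for match, score in [(True, None), (False, 0.91), (False, 0.62), (False, 0.2)]:
    print(usability_label(match, score))
```

A pricing model could then attach a rate to each label, much as TM-based pricing attaches rates to match bands.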

So what is next for TM?

TM and MT technologies are not only complementary to each other but actually symbiotic. Combining TM within MT yields a more efficient translation process and enables greater translation accuracy and fluency for customized MT users.

As far as this writer is concerned, the right place for TM moving forward is as an integral part of MT… 

3 Comments

  • Aidan Collins

    IMO, thankfully – both TM and Human Translators will remain an integral part of the Machine Translation process. It is not a zero sum situation.

  • Linda Richardson

    Udi, great information in this blog! I am curious though, is the usability of MTM only for large volumes of data or can the average LSP’s client benefit from this technology?
    Thanks!

  • Tatiana Gornostay

    No matter what – TM, MT, dictionaries, etc. – the user cares about his/her comfort and productivity ;-) As usual, the winner will be a smart integration of the above!
