Translation Memories (TM) are widely used today to aid human translators in their work. As a translation tool, they provide access to archives of previously translated text, helping reduce overall translation effort, time and cost while introducing greater consistency in corporate language translation.
It is a lesser-known fact that Machine Translation (MT) can do the same thing, with the additional benefit of NOT being restricted to archived matches only.
The process is called TM overlay. It allows MT to replicate the original TM (not only learn from it) alongside the statistical model in a high-performance in-memory database. We will call it Machine Translation Memory (MTM). The MTM identifies exact matches from the archive in real time. Any segment not found in the archive is then translated using statistical MT.
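The lookup-then-fallback flow described above can be sketched roughly as follows. This is a minimal illustration, not an actual MTM implementation: the dictionary standing in for the in-memory database, the `translate_segment` function, and the `fake_mt` engine are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict, Tuple


def translate_segment(
    source: str,
    mtm: Dict[str, str],
    statistical_mt: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (translation, origin) for one source segment."""
    # Exact match against the in-memory archive is tried first.
    if source in mtm:
        return mtm[source], "mtm_exact"
    # Any segment not found in the archive falls through to statistical MT.
    return statistical_mt(source), "statistical_mt"


def fake_mt(source: str) -> str:
    """Hypothetical stand-in for a real statistical MT engine."""
    return f"<MT:{source}>"


# Assumption: a plain dict plays the role of the in-memory database.
mtm = {"Hello world": "Hallo Welt"}

archived = translate_segment("Hello world", mtm, fake_mt)
unseen = translate_segment("Goodbye", mtm, fake_mt)
```

Returning the origin alongside the translation is a design choice made here so that later steps can tell archive matches apart from statistical output.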
How does this change the translation process?
Today’s process separates archive matching from statistical MT:
The new process consolidates automated translation into a single step:
1) It eliminates the use of approximate string matches (fuzzy matches) and replaces them with far more accurate statistical correlations, speeding up translation turnaround and improving translation quality.
2) The MTM can be updated in real time with each segment committed by a translator, reviewer or editor. Every update to the MTM allows it to produce exact matches for recently committed segments, even within the same project. This helps increase consistency and eliminates the need for addressing repetitions.
3) MT becomes aware of every update to the MTM (unlike a separate TM) and can trigger an automated retraining of the statistical models, improving overall translation quality.
4) Finally, the consolidation of the multi-stage automated translation process reduces cycles and eliminates integration points, thus improving overall translation process efficiency.
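Points 2 and 3 above can be illustrated with a small sketch. Everything here is an assumption made for the example: the `MachineTranslationMemory` class, its `commit` and `lookup` methods, and the idea of triggering retraining after a fixed number of commits (`retrain_every`) are hypothetical, and a real system may decide when to retrain quite differently.

```python
from typing import Callable, Dict, Optional


class MachineTranslationMemory:
    """Toy in-memory MTM that notifies the MT side of every update."""

    def __init__(self, retrain: Callable[[], None], retrain_every: int = 100) -> None:
        self._segments: Dict[str, str] = {}
        self._retrain = retrain              # hypothetical retraining hook
        self._retrain_every = retrain_every  # assumed trigger policy
        self._pending = 0

    def commit(self, source: str, target: str) -> None:
        # The committed segment is available as an exact match immediately,
        # even for later segments in the same project.
        self._segments[source] = target
        self._pending += 1
        # Because the MTM sees every update, it can trigger retraining itself.
        if self._pending >= self._retrain_every:
            self._retrain()
            self._pending = 0

    def lookup(self, source: str) -> Optional[str]:
        return self._segments.get(source)


retrain_calls = []
demo = MachineTranslationMemory(retrain=lambda: retrain_calls.append("retrain"), retrain_every=2)
demo.commit("Save file", "Datei speichern")
demo.commit("Open file", "Datei öffnen")
```

After the second commit in this example, both segments resolve as exact matches and the retraining hook has been called once.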
Is a separate TM still required?
The answer is process dependent. If you decide to update the MTM on the fly, meaning updates are translator-driven, you may want to keep a ‘clean’ TM on the side (it can still sit under the MT umbrella) to ensure that only edited and reviewed segments end up in the TM.
However, by tagging each segment in the MTM as either ‘dynamically updated’ or ‘approved’, the MTM can be used as the primary TM: for each new project, only ‘approved’ segments are replicated into the in-memory database. CAT tools can still access the MTM through an API to extract translation matches.
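A rough sketch of the tagging idea, assuming a simple in-process representation: the `Status` enum, the `Segment` record, and the `load_project_memory` function are hypothetical names chosen for illustration, not part of any particular product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class Status(Enum):
    DYNAMIC = "dynamically updated"
    APPROVED = "approved"


@dataclass(frozen=True)
class Segment:
    source: str
    target: str
    status: Status


def load_project_memory(segments: Iterable[Segment]) -> Dict[str, str]:
    """Replicate only approved segments into a new project's in-memory store."""
    return {s.source: s.target for s in segments if s.status is Status.APPROVED}


archive = [
    Segment("Save file", "Datei speichern", Status.APPROVED),
    Segment("Open file", "Datei aufmachen", Status.DYNAMIC),
]
project_memory = load_project_memory(archive)
```

In this example only "Save file" reaches the new project's memory; the dynamically updated segment stays in the archive until it is approved.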
What about TM pricing models?
Segment-level usability estimations produced by MT allow existing TM pricing models to stay in place and even improve on them. A usability score attached to each segment allows identification of MTM-matched segments. MTM matches are tagged as “MTM exact match” or “Perfect”, while statistical ‘matches’ are tagged as “Very Good”, “Acceptable” or “Poor”, as appropriate.
This helps translators assess the level of effort required of them, make intelligent decisions on whether to accept, edit or reject each segment, and keep effort and cost aligned.
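The labeling scheme could be sketched along these lines. The labels come from the text above, but the numeric thresholds (0.8 and 0.5), the 0 to 1 score range, and the `usability_label` function are purely illustrative assumptions; the text does not specify how scores map to labels.

```python
def usability_label(origin: str, score: float) -> str:
    """Map a segment's origin and usability score to a translator-facing label."""
    # Segments served from the archive are always exact matches.
    if origin == "mtm_exact":
        return "MTM exact match"
    # Assumed thresholds on an assumed 0..1 scale; real cut-offs would be tuned.
    if score >= 0.8:
        return "Very Good"
    if score >= 0.5:
        return "Acceptable"
    return "Poor"


labels = [
    usability_label("mtm_exact", 1.0),
    usability_label("statistical_mt", 0.9),
    usability_label("statistical_mt", 0.6),
    usability_label("statistical_mt", 0.2),
]
```

A pricing model could then be keyed on these labels in the same way existing TM pricing is keyed on match bands.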
So what is next for TM?
TM and MT technologies are not only complementary to each other but actually symbiotic. Combining TM within MT introduces a more efficient translation process, enabling a greater level of translation accuracy and fluency for customized MT users.
As far as this writer is concerned, the right place for TM moving forward is as an integral part of MT…