27/01/2025

Save on translation costs with translation memories

Almost regardless of the economic situation, saving on translations is a topic that comes up time and again. The translation industry is under cost pressure because companies want to minimise the costs of multilingualism. Many make do with machine translation or enter new territory with large language models, neglecting quality in the process. But you do not always have to use new technologies to save money. We show you the potential savings that can be achieved with translation memories, a well-established translation tool, how their use can be supported effectively and why the language data they generate should be highly valued.

Even though translation memories (TMs) are a standard type of software in the translation industry, many companies have little direct experience with them. TMs are a component of computer-assisted translation (CAT) tools. If companies do not have CAT tools themselves, the tools are used in the background by the translation service provider or by freelance translators. Translation memories store translated texts for a specific language pair and divide them into segments. A segment usually comprises a complete sentence, but can also be just a single word, for example if it appears alone in a list. Each segment stored in the TM is also enriched with metadata, such as the creation date, and is available for future projects.

Using translation memories

Each new text to be translated is automatically compared with the translation memory for the relevant language direction. Identical segments – i.e. those that have already been translated – and segments that are similar to existing translations are identified. A segment that has already been translated is referred to as a 100% match and can be pre-translated in the new text. The translator then only has to check it in the context of the project, so it is charged at a discounted rate. Similar segments are between 1% and 15% different from an already translated segment and are referred to as fuzzy matches. In these cases, the original translation can be used as a basis and the part that is different can be adapted. Fuzzy matches also result in reduced costs, as the segment does not need to be translated from scratch.
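The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `tm`, the function `best_match` and the example sentences are hypothetical, and the similarity score comes from Python's `difflib`, which is a stand-in and not the scoring used by any particular CAT tool. The threshold of 0.85 follows the range of up to 15% difference mentioned above.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory TM for one language direction (EN -> DE):
# source segment -> stored translation.
tm = {
    "Press the green button to start the machine.":
        "Drücken Sie die grüne Taste, um die Maschine zu starten.",
    "Switch off the power supply.":
        "Schalten Sie die Stromversorgung aus.",
}

# Assumption: segments that are at most 15% different count as fuzzy matches.
FUZZY_THRESHOLD = 0.85


def best_match(segment, memory):
    """Return (category, similarity, matching stored source segment or None)."""
    if segment in memory:
        return "100% match", 1.0, segment

    best_source, best_ratio = None, 0.0
    for source in memory:
        # Character-based similarity between 0.0 and 1.0.
        ratio = SequenceMatcher(None, segment, source).ratio()
        if ratio > best_ratio:
            best_source, best_ratio = source, ratio

    if best_ratio >= FUZZY_THRESHOLD:
        return "fuzzy match", best_ratio, best_source
    return "no match", best_ratio, None


for new_segment in [
    "Switch off the power supply.",                # already translated
    "Press the red button to start the machine.",  # one word changed
    "Clean the filter once a week.",               # new content
]:
    category, ratio, source = best_match(new_segment, tm)
    print(f"{category:11} {ratio:.2f}  {new_segment}")
```

For a fuzzy match, the stored translation of the returned source segment would be offered to the translator as a basis for adaptation.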

Using a translation memory saves time and money, as texts that have already been translated do not have to be translated and charged for again. Reusing segments also creates consistency with previous texts, which can have a significant impact on the quality of the translation. Even if companies do not maintain TMs themselves, the data stored on their behalf should be available to them at any time as an export. We have already explained the issue of translation memory ownership rights in a blog post.

Using existing translations

With translation memories, content that has already been translated is reused. However, the content is usually checked again each time and therefore comes with a charge – albeit at a discount. If the TM content is older, terminology has been changed or the quality of the TM data is not assured, for example if you have changed service provider, it definitely makes sense to check the 100% matches. In other cases, however, a new review will not be necessary, for example because the original document was only translated a few weeks ago. If the client does not want previously translated segments to be checked or paid for again, there are two options: either only new text is assigned for translation or pre-translations from the TM are locked. There are advantages and disadvantages to both options and they should be weighed up with the service provider in advance.

Focus on new content

When translating only new parts of a text, for example an additional section of a long manual, existing content is not checked or charged for. In editorial and PIM systems, new blocks of text can usually be filtered out at the touch of a button thanks to modular text storage. These can then be exported as a file and transferred back into the existing document after translation. For Word or InDesign files, different versions can be compared so that changes or additions are easy to see. However, if the text sections to be translated are then compiled manually, in addition to the extra work involved, there is an increase in the potential for errors. This is because parts of a text or changes to a text can be overlooked during compilation or end up in the wrong place when they are inserted. There is an obvious risk here, especially if the person compiling the texts is not proficient in the relevant language or writing system.
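As a rough illustration of comparing two versions, the following sketch picks out the segments that are new in a revised document and records their positions, so that the translations can later be inserted at the right place. It assumes that both versions are already available as lists of segments; the function `new_segments` and the example texts are hypothetical and do not represent an export feature of any editorial system.

```python
def new_segments(old_version, new_version):
    """Return (position, segment) pairs for segments that only occur in new_version."""
    known = set(old_version)
    return [
        (position, segment)
        for position, segment in enumerate(new_version)
        if segment not in known
    ]


old_manual = [
    "Switch off the power supply.",
    "Open the cover.",
]
new_manual = [
    "Switch off the power supply.",
    "Wait until the fan has stopped.",  # added in the new version
    "Open the cover.",
]

for position, segment in new_segments(old_manual, new_manual):
    print(position, segment)
```

Keeping the position alongside each segment addresses the risk described above that text ends up in the wrong place when it is transferred back into the document.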

The option of compiling only new parts of the text provides less context in the translation process and the segments to be translated may be more difficult to understand. Furthermore, it is not possible to know and match the style or wording of the rest of the document, which can lead to style changes and inconsistencies. This can be avoided if the entire document is supplied as a reference, in which the translator can then search for the passages to be translated. This can be automated from some editorial systems. For the sake of context and consistency, it always makes sense to provide the entire document for translation.

Locking 100% matches for editing

This is also the aim of the second option, in which the existing 100% matches from the translation memory are used, but are locked for editing in the CAT tool. The source text and the existing translation are displayed, but cannot be edited by the translator. The text thus appears in its overall context and the new texts to be translated can be correctly understood. However, locked segments are also not checked, which means that errors or outdated terminology in TM matches cannot be identified. This can lead to inconsistencies, particularly with regard to terminology. For example, if preferred terms have changed since they were first added to the terminology database, they will only be applied in new segments.
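The locking behaviour can be pictured with a small sketch. The class `Segment` and the functions `pretranslate` and `edit` are hypothetical names chosen for illustration and are not the interface of any real CAT tool. The sketch assumes that only exact matches from the TM are pre-translated and locked.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str
    target: str = ""
    locked: bool = False


def pretranslate(sources, memory):
    """Fill in 100% matches from the TM and lock them; leave new segments open."""
    result = []
    for source in sources:
        if source in memory:
            result.append(Segment(source, memory[source], locked=True))
        else:
            result.append(Segment(source))
    return result


def edit(segment, new_target):
    """Change a translation, unless the segment is locked."""
    if segment.locked:
        raise PermissionError("Segment is locked; note the issue in a comment instead.")
    segment.target = new_target


memory = {"Open the cover.": "Öffnen Sie die Abdeckung."}
document = pretranslate(["Open the cover.", "Remove the filter."], memory)

edit(document[1], "Entfernen Sie den Filter.")  # new segment: editable
# edit(document[0], "...") would raise PermissionError, because the match is locked.
```

The translator still sees both segments in sequence, which preserves the context, but a stored translation that is outdated or incorrect stays as it is.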

The translator has no way of immediately correcting any errors that they find in existing segments when they are skim-reading them to understand the text. They can only note necessary changes in comments or notes on delivery. Another potential disadvantage of locking segments relates to handling data. If the entire document is always sent over, this quickly means large file packages, especially if the files are from layout programs, which have to be packaged and possibly sent via a file exchange program.

Clean segmentation

Translation memories thrive on clean data. For segments to be reused, clean segmentation and correct translation are essential. This is because complete segments can only be used without hesitation in future translation projects if they have been translated correctly and have exactly the same content as the source-language segment. Our article on translation-oriented writing explains what can hamper meaningful segmentation. For example: changes to the layout, such as a manual line break, fragment a sentence and split it into two segments. Because word order differs between languages, this fragmentation can lead to different content in each segment, meaning that the source and target segments do not correspond with one another:

[Figure: Example of bad segmentation]

The reusability of these segments is also low, as in a later project the same sentence might not contain a line break. There would then be no 100% match, although the sentence has already been translated.
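The effect of a manual line break can be reproduced with a deliberately naive segmenter. The function `segment` below is a simplified stand-in for the segmentation rules of a CAT tool and rests on the assumption that a line break always ends a segment; the example sentence is hypothetical.

```python
import re


def segment(text):
    """Naive segmenter: split at line breaks and after sentence-final punctuation."""
    segments = []
    for line in text.split("\n"):
        for part in re.split(r"(?<=[.!?])\s+", line.strip()):
            if part:
                segments.append(part)
    return segments


clean = "Press the green button to start the machine."
broken = "Press the green button\nto start the machine."  # manual line break

print(segment(clean))   # one complete sentence
print(segment(broken))  # two fragments

# The complete sentence is not among the fragments, so a TM filled
# from the broken version cannot deliver a 100% match for it.
print(clean in segment(broken))
```

If the two fragments are stored in the TM, a later document containing the same sentence without the line break is compared against them as a whole sentence and finds no identical entry.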

So to optimise the use of TM content and the potential for savings that this brings, it makes sense to review text creation processes to ensure that they use translation-oriented writing. Correctly segmented texts also minimise queries, saving time and money in this process step too, because complete sentences are better understood and content can be clearly linked up.

Cleaning up translation memories

Translation memories also grow rapidly, as is the case in all processes in which data is created and stored. Segments are added with each new translation project, which can quickly amount to several thousand in one go if the document is a large one. This growth alone makes it sensible to regularly review the content. If unclean segments are added, for example for the reasons mentioned above, it is essential to clean them up. This is because fragmented segments and a lack of correspondence between the source and target languages render the stored segments worthless and, in the worst case, can lead to errors in subsequent translations.

In addition, depending on the settings, there may be duplicates when translations are saved, i.e. there are two or even more occurrences of the same content. These are not recognised as a 100% match in subsequent projects because no clear match can be assigned to the source text. Duplicates therefore lead to higher translation costs, as they are categorised and calculated as fuzzy matches. Changes to terminology may also require TM content to be cleaned up. All terms to be changed are searched for and corrected in the TM data so that they can be used again later as correct matches.
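A first step in such a clean-up could be a simple duplicate check. The sketch below assumes that the TM has been exported as a list of source and target pairs; the function `find_duplicates` and the sample entries are hypothetical, and real TM exports carry additional metadata that is ignored here.

```python
from collections import defaultdict


def find_duplicates(entries):
    """Return source segments stored more than once, with all their stored translations."""
    by_source = defaultdict(list)
    for source, target in entries:
        by_source[source.strip()].append(target.strip())
    return {
        source: targets
        for source, targets in by_source.items()
        if len(targets) > 1
    }


entries = [
    ("Open the cover.", "Öffnen Sie die Abdeckung."),
    ("Open the cover.", "Abdeckung öffnen."),  # same source, second translation
    ("Switch off the power supply.", "Schalten Sie die Stromversorgung aus."),
]

for source, targets in find_duplicates(entries).items():
    print(source, "->", targets)
```

The check only reports the candidates. Deciding which translation should remain in the TM is still a linguistic decision.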

Regardless of whether the problem is pure data volume, unclean data or duplicated data, when using TMs, it makes sense to develop a clean-up routine or have potential for cleaning up the data checked as part of an analysis. This allows targeted steps to be taken to clean up the translation memory and keep the TM up to date. Our oneCleanup service, for example, starts right here, analysing the potential for cleaning up TM data and then enabling a semi-automated clean-up. After all, TMs provide valuable language data that can ensure consistency, speed up processes and save on costs. Businesses should never disregard this valuable asset.

Would you like to find out more about using translation memories and cleaning up language data with oneCleanup? Then contact us today.
