17/01/2024
Quality time with MittagQI: developing what the market really needs
It’s Quality Time again. In a new episode of our series of expert interviews, Dr Juliane Schwab (Head of oneSuite) and Jasmin Nesbigall (Head of Terminology Management and MTPE) spoke to Marc Mittag from MittagQI. An insightful conversation about the translate5 translation system, the special features that set it apart from other CAT tools and the increasing integration of generative AI.
oneword provides the browser-based translation and localisation system oneSuite, which is based on the open source translation system translate5, a product from Mössingen-based MittagQI Quality Informatics. The head and namesake of the company is Marc Mittag, who has long specialised in web development in the field of language services to support the automation and configuration of translation processes.
Marc Mittag (MittagQI), Dr. Juliane Schwab (oneword GmbH) and Jasmin Nesbigall (oneword GmbH)
oneword (OW): Hi Marc. Thanks for taking the time to talk to us about translate5.
translate5 is a translation system, the ongoing development of which is jointly financed by a consortium currently made up of 15 language service providers – including ourselves. The mission is “From the translation industry for the translation industry”. What is the appeal of this business model for you?
Marc Mittag (MM): It wasn’t originally intended as a business model, but was born out of my desire to help make the world a little bit better – even if that may sound trite. I come from a web development background and was aware of lots of great open source solutions in this field. In my first job at a language service provider, I realised that there wasn’t a single such solution in the translation sector.
At the same time, some of the systems we were using were not really designed with users in mind. There was lots of good marketing and sales, but much of the content just didn’t work well. That was what prompted me to develop a different system: for users by users, open source and without a focus on marketing or profit. Initially, I even wanted to make the system available completely free of charge, but the current consortium members convinced me that this wouldn’t work.
OW: The model also means that decisions about which functions are developed further ultimately lie with the consortium members. What requirements did you encounter and which of them perhaps surprised you?
MM: The main thing members want is stable and high-performance software that simply works. That’s why our focus is on good, fast support and rapid fixing of bugs – which certainly sets us apart from other tools.
In terms of features and new functions, I was most surprised by everything linked to InstantTranslate. This feature for direct online translations, which integrates machine translation, translation memory systems and terminology data, was wholly the result of requirements from the consortium that I would never have come up with or anticipated myself. I was also somewhat surprised by the response from the market. But that’s precisely why the consortium and the motto “from the industry for the industry” work so well: we develop what the users really want and not something that could only be good in theory.
OW: InstantTranslate – we call the module oneTranslate – has struck a chord with the times as machine translation and AI have become an integral part of our industry. Is it mainly complete modules that are being driven forward by the consortium? Or is there a continual flow of minor input for improvements, for example for functions in quality assurance?
MM: Both. One of the first features that was requested back in 2013 was the addition of a pivot language (also known as a relay language) in the review process – after all, the review is quality assurance par excellence. Users wanted a bridge language, for example English, to be included for reviewers who do not understand the source language of the translation. In 2023, we extended this at oneword’s request so that the pivot language can now also be displayed in the real-time layout preview of the Visual feature.
OW: For us, these are features that serve our end customers and their needs. But you also have industrial customers who use the tool. How do they interact with the consortium in terms of feature requests and new ideas? There must be requirements that apply to only one of these two parties?
MM: The interaction between industrial companies and the consortium is getting really exciting. Industrial companies have been using translate5 since 2017. Until now, they have financed their feature requests themselves. Visual, for example, was something that arose from an industrial customer requirement and is now used by everyone. And that brings us to the interaction: once the feature existed, the consortium stepped in and drove a great deal of further development.
OW: And in purely technical and financial terms, if something is financed by industrial customers, what needs to happen for it to be available to everyone? Or is everything that is developed always available to everyone?
MM: As I said, I originally wanted to make everything we develop available to everyone free of charge. But then consortium members said that within their organisation they couldn’t justify spending a lot of money on a feature that all their competitors in the consortium – and even outside it – could then use for free. So we had to find other models, such as paid plug-ins. The features are therefore available as soon as they are ready for the market or as soon as a company decides to buy the paid plug-in.
The integration of ChatGPT is an exception to this rule. It was the first time that we at MittagQI wanted to move faster than we could get financial support from the consortium and the industrial companies. So we invested 300 development hours without any funding. We will release this function as a paid plug-in and in this way refinance our work. However, we also want to get the consortium and industrial customers on board for further development and not drive it forward on our own.
OW: Can you give us a little more insight into the work of the consortium? The consortium brings together 15 language service providers, i.e. potential competitors. How much agreement is there and how often do you have to step in as a mediator? Are there competitive situations or is everyone pulling in the same direction to advance the tool?
MM: As far as features are concerned, everyone is actually pulling in the same direction, or if they are pulling in different directions, it doesn’t do anyone any harm. One LSP may want a feature that someone else doesn’t need, but no one will mind if it is developed.
There are more problems in terms of capacity, i.e. the question of what is developed first. To guarantee our high quality, we only work with developers with permanent employment contracts. So we can’t scale up at will when the workload is high. The consortium does have a mechanism for determining what is assigned the highest priority. But I can’t have every feature developed by just anyone in my team. You’ll know what I mean: medical technology translators can’t suddenly be deployed in the automotive sector just because there is high demand for such work. And in addition to the features, there are also bug fixes, i.e. daily business, so there’s always a lot to juggle.
From what I see, there are only competitive situations in the consortium when it comes to forward-looking ideas – in other words, which topics we want to tackle together. When all 15 members come together, they tend to shut down for fear of revealing too much to the competition.
OW: Is the consortium approach actually a marked contrast to the fundamental idea behind open source? Open source means that the entire code is openly accessible and can be used, checked and further developed by anyone. Does this philosophy work for you?
MM: There’s a widespread misconception that open source is synonymous with being able to download everything free of charge. The Free Software Foundation, the world’s best-known open source advocates, say: “Charge as much money as you can for your open source software.” Open source means the right to be able to do certain things with the source code if I have access to it. It does not mean that I get this source code for free. Our paid plug-ins are also available under an open source licence – not only out of principle, but because we use components in them that oblige us to do so. But that doesn’t mean we have to give them out free of charge. Anyone who has the source code can read it and potentially correct or further develop it. However, only the core software without plug-ins can be downloaded free of charge.
OW: How much interest is there from companies or individuals in getting involved in developing the software?
MM: Surprisingly little. Perhaps that shows that we’re doing our job well.
There have already been some contributions from consortium members who have developed some good software, for example for number checking in quality assurance. In principle, we very much welcome this, but it really doesn’t happen very often. This is certainly due to the nature of our customer base. LSPs focus on language services and not on software development. Some are developing their own business software to manage their processes, but as far as translate5 is concerned, there is a consensus that they would prefer us to fulfil their requirements.
I see this as a major advantage of the model; you don’t have to do everything yourself, but you have enough say and influence to not just have to accept a finished product.
OW: Speaking of finished products, can you name three special features of translate5 that set it apart from other well-known translation tools?
MM: Firstly, the open source and community-driven model, which means that we do what our customers ask us to do – and that’s all we do. This also shapes the way we provide support. If you implement open source, you also need to have good support, otherwise customers will look for it elsewhere.
Secondly, the licence model. We operate under an open source licence and don’t count how many users are using the application or even – as is the case with other tools – how many words are flowing through the application. Most companies have a model with an unlimited number of users.
And thirdly, the features that represent real USPs, such as Visual, InstantTranslate and now also the new GPT feature. This allows us to offer things that other tools simply cannot currently do.
OW: With regard to the GPT feature, you said that it was an in-house development that wasn’t funded. What prompted you to do this? Were you simply interested, did you want to get ahead of the curve or did you already see a specific need?
MM: Actually, all three combined. In autumn 2022, I initially dismissed ChatGPT as hype, but then quickly realised that it was having a major impact on the industry. There is incredible potential for what can be done with generative AI. So we simply had to be quick off the mark.
In the first stage, we collaborated with an industrial company and then we were supported by a consortium member. Without their linguistic expertise, we wouldn’t have been able to carry out the development at all because a lot of linguistic know-how is required to develop the functions in the first place and, of course, to evaluate the results. The mix of industrial customers and the consortium again worked really well here.
OW: Apart from the major factors that distinguish the tool from others, what else makes translate5 special, for example in terms of quality assurance?
MM: translate5 was originally a pure quality assurance tool with one main strength that it still retains today. It provides an application environment so that people can be involved in the quality assurance of translations without needing to be familiar with all the processes behind it. The Visual feature, for example, developed from this approach, and now also handles file formats such as videos and apps.
One particular detail is Quality Estimation, which was the first tool that we integrated three years ago for risk assessment. Quality Estimation means that a machine provides an estimate of how good the quality of the translated segment is. The algorithm behind it is ultimately the same as for machine translation. The machine learns on the basis of existing translations, then looks at the source and target text of a new project and assigns each segment a score indicating how high the risk is that the translation could be incorrect.
OW: This also reduces the effort involved in post-editing because segments with a very good score no longer need to be checked at all.
MM: Exactly.
OW: Both Quality Estimation and GPT fall under the hyped topic of artificial intelligence, which no one in the translation sector can ignore any more. What else are you and your team planning to do in terms of AI?
MM: A lot! We want to press ahead with GPT integration not just as an additional translation source, but as a function in the entire application. As a translation engine, GPT is very easy to train so you can put together a specific engine in under 15 minutes – something that would otherwise take weeks and a huge amount of TM data. Quality estimation based on GPT is also highly plausible. However, it’s important to us that we take the next steps together with the consortium and our industrial customers, so it always depends on what requirements exist and what the majority want.
Terminology extraction with GPT as an upstream workflow step for translation is also possible, meaning that the terminology is created before translation and can then be directly incorporated into quality assurance of the translation.
And then we are planning a feature with the working title “InstantContent”. This will allow target language content to be generated on the basis of trained language resources. Because that’s where the industry is heading; content is no longer just translated, but created directly in the target language.
OW: We’re really looking forward to seeing how this develops and what we can get the consortium actively involved in. We’ve now come to our last section, “Short question, short answer”. We have five short questions for you and want you to use gut instinct to respond quickly. Here we go:
Which feature are you most looking forward to?
MM: GPT integration and improving Visual.
OW: What are your three highlights from 2023?
MM: Even though it sounds like I’m repeating myself: GPT and everything we’ve done with it so far. Secondly, the keyword “enterprise-ready”, i.e. the fact that large industrial companies have switched to translate5 and that the tool has once again proven itself in a completely new way. And thirdly, the fact that we were the first tool in the industry to become “cloud-native” and you can install translate5 on premises in the cloud yourself. The software can be distributed and scaled as required. And users will be able to tell, because individual processes simply run much faster.
OW: Please complete the following sentence for us: For me, software development is…
MM: …first and foremost actually art. Code must be beautiful. It also has to work, of course, but if it’s not beautiful, at some point it’s no longer maintainable. It’s the same with text: when I look at a text, I see letters and words, but it doesn’t look nice. The beauty only comes when you read and understand it. It’s the same with code: it’s about beauty in the structure.
OW: If you could conjure up one thing in the tool right now, what would it be?
MM: That we have seamlessly and smoothly integrated all content providers the world over out-of-the-box and done so free of charge, just as we did with Blackbird – a platform for connectors in the translation industry.
OW: Our last question: Where do you see translate5 in five years’ time?
MM: The tool is widely used in the sector, both by service providers and by industrial companies. It’s closely interlinked with AI and therefore offers a lot of useful support for everyday working life.
OW: We are very excited about everything the future has in store! Thank you very much for talking to us, Marc.