26/04/2022
Translation quality
Synthetic content: aside from machine translation, what can artificial intelligence do and what does it need?
Companies can use synthetic texts and translations, images, videos and voices created by artificial intelligence to great effect with minimal effort. Nicole Sixdorf, our expert for translation management, explains what to look out for and how this works in a sensible, targeted and successful way.
We all know the scene: Keanu Reeves as Neo in the 90s blockbuster “Matrix”, who not only learns various martial arts at the touch of a button, but also creates all kinds of equipment and utensils from nothing more than programming code. What was once mere fantasy has now become (virtual) reality thanks to sophisticated algorithms and increasingly elaborate AI applications.
In our digital everyday life, the counterpart to the film's matrix is synthetic content: artificially created content that we encounter again and again in different forms, often without being aware of it. The best-known examples are certainly the so-called deepfakes, which have appeared in the media with increasing frequency in recent years. Synthetic content, however, is far more diverse.
What is synthetic content?
The term synthetic content describes content created by artificial intelligence (AI) or learning algorithms (or, as in “Matrix”, simply machines) without a human being involved in the actual creation process.
For example: when a text is to be translated from German into English, a translator usually transfers it from one language into the other, drawing on talent as well as cognitive and motor skills. This is “traditional” human-created content: a human translation. If you instead copy the text into a translation engine such as DeepL, a neural network creates a synthetic translation based on a very large number of parameters and data points, without a human being involved in creating that specific translation. This is machine translation.
Synthetic content has limitless possibilities
Nowadays, machines can generate not only texts, but also images, videos, voices and music from an almost endless reservoir of data. At the push of a button, for example, the website This person does not exist creates images of people who, as the title says, simply do not exist. The advantage: when such images are used in marketing brochures, for example, there are no personal image rights or personal data of a real person to consider. The downside: these non-existent people are virtually indistinguishable from real ones, which opens up potential for deception and abuse.
Examples of artificially generated people (Source: This person does not exist)
In most cases, all that is needed to create synthetic content is minimal input of short descriptions, so-called prompts, or corresponding parameters that provide the artificial intelligence with a path. Behind or along this path, however, are huge amounts of processed data and a sophisticated network of artificial neurons.
From generic artwork, images of people, animals or fantasy creations, to eloquent essays, newspaper articles or product descriptions, to voices from Alexa or Siri and generated videos in dozens of languages, there are almost no limits to the creation of synthetic content. However, the quality of the results also ranges from absurd and nonsensical to passable and deceptively genuine.
Language is the common denominator
For artificial intelligence (AI) to know what to create, communication between humans and AI must be established. So if we want Alexa to tell us what the weather will be like tomorrow so that we can decide whether we should get the rain jacket out of the cupboard, Alexa – the AI – must be able to process our speech.
The processing of language is called Natural Language Processing (NLP). “Natural language” in this case means human language, because computers are usually only language geniuses when it comes to programming languages.
NLP is divided into two main areas. The first is Natural Language Understanding (NLU), which enables the machine to “understand”, i.e. process and interpret, human language. The second is Natural Language Generation (NLG), i.e. the generation or reproduction of human language by AI. Examples of NLG include written output, as in machine translation, and spoken output, as in the Amazon Echo at home.
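To make the split between NLU and NLG more tangible, here is a deliberately simple sketch in Python based on the weather question from above. It is a toy illustration only: the function names understand and generate, the keyword matching and the forecast data are all assumptions made for this example, and real assistants such as Alexa rely on trained models rather than keyword rules.

```python
def understand(utterance: str) -> dict:
    """Toy NLU step: map an utterance to an intent and a day via keyword matching."""
    text = utterance.lower()
    intent = "weather" if "weather" in text else "unknown"
    day = "tomorrow" if "tomorrow" in text else "today"
    return {"intent": intent, "day": day}


def generate(request: dict, forecast: dict) -> str:
    """Toy NLG step: turn structured data back into a sentence."""
    if request["intent"] != "weather":
        return "Sorry, I did not understand that."
    condition = forecast[request["day"]]
    return f"The weather {request['day']} will be {condition}."


# Hypothetical forecast data; a real assistant would query a weather service
forecast = {"today": "sunny", "tomorrow": "rainy"}

request = understand("What will the weather be like tomorrow?")
print(generate(request, forecast))
```

The first function reduces free text to structured data (understanding), the second turns structured data back into a sentence (generation). Everything that makes real NLP difficult, such as ambiguity, grammar and context, is left out here on purpose.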
Another prime example of the progress made in NLP and synthetic content is the avocado armchair, which was created in various forms by the neural network DALL-E. This impressive AI generates images from short text descriptions that users can write freely, and can thus directly produce suitable images for articles, descriptions and other texts. To do this, however, the AI has to establish relationships between words, i.e. recognise what is a noun, adjective or verb, and be able to interpret the individual words. After all, a machine has no idea what an “armchair” is unless this is defined on the basis of data. This is exactly where NLP comes in.
Avocado armchair (source: DALL-E)
Synthetic content in companies
The areas of application are just as diverse as the types and possibilities of artificial content. Larger companies now use synthetic content, for example, in customer service, personalised marketing, e-learning, recruiting or, more generally, in traditional content creation for websites, webshops or even technical documentation.
The common goal of all synthetic content applications is to provide content more quickly, more personally and yet cost-effectively. And also in bulk and in a multitude of languages.
The best-known example is probably Deutsche Bahn, which uses text-to-speech to generate multilingual computer-voiced announcements at stations and on trains and buses.
Just as widespread, but less well known, is the use of synthetic content in e-commerce. More and more webshops have individual product descriptions generated for portfolios of thousands of products and product variants, quickly, cost-effectively and to a high standard. All that is needed is access to a few structured data points in a commercially available product information system.
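How a handful of structured data points can become a product description is easiest to show with a small Python sketch. Please note that this is a simplified illustration and not how any particular webshop or tool works: the Product fields, the templates and the describe function are assumptions made for this example, and commercial solutions typically use trained language models rather than fixed templates.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical fields standing in for data points from a product information system
    name: str
    category: str
    color: str
    material: str


# Hypothetical templates for two tones of voice
TEMPLATES = {
    "factual": "{name}: {category}, {color}, {material}.",
    "promotional": "Meet the {name}, our {color} {material} {category}.",
}


def describe(product: Product, tone: str = "factual") -> str:
    """Fill the template for the requested tone with the product's data points."""
    return TEMPLATES[tone].format(
        name=product.name,
        category=product.category,
        color=product.color,
        material=product.material,
    )


chair = Product(name="Avo", category="armchair", color="green", material="velvet")
print(describe(chair))
print(describe(chair, tone="promotional"))
```

Applied to thousands of data records, the same few lines would produce thousands of descriptions. That also shows where the limits are: the result can only ever be as good as the data and templates behind it.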
AI now also enables individual texts to be (re-)written, for example, if the tone needs to be adapted for a specific target group. Instead of completely rewriting texts, they are “corrected” by programmes to make them more concise, more informal, more promotional or more factual, as required.
Larger companies are now even putting deepfakes to positive use. Synthetic videos are used, for example, for employee training and further education or in customer service, to show a uniform “face” across worldwide company networks while still interacting in as personalised a way as possible. The news agency Reuters, for example, uses AI and synthetic content applications for automated video reports.
Conclusion: Is synthetic content our brave new world?
AI and synthetic content applications seem to be the solution to all production and efficiency issues. Because, hey, they provide texts and translations, images, videos and voices in all the languages of the world at the touch of a button and with minimal input. Unfortunately, that is not quite true.
Machines reach their limits again and again. All artificial intelligence needs human expertise. And any learning algorithm is only as good as the data on which it is based. The more specific requirements become, the faster the output quality dwindles or the investment costs rise.
In addition, humour, wordplay, colloquial language, emotion or logical sequences of thoughts are often places where synthetically produced texts, videos, sounds and images fail. Abstract content is also misunderstood because it requires a capacity for transfer that AI is not (yet) capable of.
As a result, the synthetic “text quality” is fundamentally different from that of a human-created text. Ultimately, human input is needed again to adapt content and ensure quality. For example, post-editing machine-generated content, regardless of the language, remains necessary to obtain truly high-quality and natural-sounding results. It is just as important to pre-edit scripts if they are going to be used for synthesised videos, for example. Voices in text-to-speech applications also often require touch-ups by qualified native speakers to sound as natural as they should.
Quality checks and continuous quality evaluation therefore remain decisive factors for synthetic content too. They make it possible to track and assess how content develops over the long term and to ensure consistently high quality.
Do you want to provide high-quality content quickly, individually and yet cost-effectively, in large quantities and in different languages? Then talk to our experts about your specific requirements and goals. This is something we are very happy to help you with.