Today, we’ll go into how machine learning drives machine translation. But we’ll do more than that.
We’ll go into the applications of machine translation, particularly its current problems, and how tech companies are slowly changing the face of the translation industry through innovations in machine translation.
Most importantly, we’ll see how translation companies are themselves at the forefront of machine translation. There are approximately 7,111 languages in the world, according to Ethnologue, one of the most extensive catalogs of languages in the world. There are major languages, minority languages, and rare or endangered languages. What does this mean? Are we living in a cacophony?
Translation, an extension of language that allows people to connect with and understand each other, is necessary in a world with so much language diversity. And translation is becoming more and more advanced, thanks to machine learning driving machine translation. Let’s dive in.
Credits: biancoblue of freepik
What Is Machine Translation?
Machine translation is the process by which software translates text from one language into another automatically, without a human translator. Generally speaking, machine translation, or MT, is a field of computational linguistics. Basically, it means software takes a source text and produces a new text in the target language.
Machine learning drives machine translation. Neural machine translation, the dominant approach today, runs on a type of machine learning system called an artificial neural network. Neural networks form the backbone of prediction, the same idea behind auto-correct and predictive text.
Machine learning functions in machine translation like this (in simple terms): a machine learns from training data fed to it by humans, and over time it learns tasks, words, phrases, and their meanings. It processes the examples one by one, improving as it goes.
So the machine improves both its learning process and its knowledge of the data it’s given. If it’s fed French text paired with English translations, for example, it can learn to translate French into English.
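To make the idea of "learning from examples" concrete, here is a minimal sketch in Python. It is not how neural machine translation works internally; it swaps the neural network for simple co-occurrence counting so the whole thing fits in a few lines. The tiny French-English corpus and the function names `train` and `translate` are made up for illustration.

```python
from collections import Counter
from itertools import product

def train(sentence_pairs):
    """Learn a word-to-word table from paired sentences (toy stand-in for a neural model)."""
    src_count, tgt_count, together = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every source/target word pair that appears in the same sentence pair.
        together.update(product(src_words, tgt_words))

    def dice(s, t):
        # Dice coefficient: high when two words almost always appear together.
        return 2 * together[(s, t)] / (src_count[s] + tgt_count[t])

    return {s: max(tgt_count, key=lambda t: dice(s, t)) for s in src_count}

def translate(table, sentence):
    """Translate word by word; unknown words are passed through unchanged."""
    return " ".join(table.get(word, word) for word in sentence.split())

# Hypothetical training data: French sentences paired with English translations.
corpus = [
    ("le chat", "the cat"),
    ("un chien", "a dog"),
    ("le chien dort", "the dog sleeps"),
    ("un chat dort", "a cat sleeps"),
]
table = train(corpus)
print(translate(table, "le chat dort"))  # the cat sleeps
```

The sentence "le chat dort" never appears in the training data, yet the sketch translates it because it has seen each word in other sentences. It only works here because word order is the same in both languages in this toy corpus; real systems have to learn reordering, grammar, and context as well.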
Anyone who’s ever used Google Translate will understand that the machine is still in the process of learning. Machine translation and machine learning are the related concepts behind Google Translate, which, to be fair, is at its heart just a robot.
The Problems of Machine Translation As It Stands Now
The problem with machine translation as it stands now, even with all the machine learning advancements of recent years, is built into its system: it’s still learning. That means it won’t be as well-versed in grammatical structures as your college professor. It won’t be as attuned to context as a scientist. And it won’t be as creative as the creatives you work with.
Here are the problems of machine translation:
Lack of Accuracy
It’s not as accurate as human translation when it comes to quality. That’s because although you can feed it basic grammar structures, it won’t be the next Ernest Hemingway. It still has problems with clauses and complex grammar. It can’t write poetry, which often bends grammar expertly. And most especially, it struggles when the source and target languages have different grammatical structures, as Spanish and English do. This means human post-editing will have to occur, which I’ll talk about later.
Lack of Contextual Cues
Machines can’t pick up the contextual cues that come naturally to a human being. Machines only know what they’re fed through machine learning. A machine can’t know that a refrigerator is near a TV if it’s only fed information about the TV. This means it requires a lot more input data in order to supply context. In machine translation, it means the machine often misses contextual cues like slang.
Lack of Cultural Sensitivity
This brings us to the next point. Since machines are machines, they have no empathy or any “feelings” that make them human. That also means that in machine translation, they’re going to be less culturally sensitive and may string together sentences that are offensive to a particular culture.
Lack of Creativity
And finally, machines can’t be Einstein. They can’t linger on “thoughts” in the hopes of a brilliant idea. They’re fed information and they transform it based on a set of rules that they’ve learned. Machines aren’t creatives, yet. Because of these problems, the translation industry is using machine translation post-editing until machines are capable of translating more accurately and more effectively through what they’ve learned.
How Is The Translation Industry Accommodating Innovations in Machine Translation?
Professional translation services use machine translation in combination with human supervision, a process called post-editing. Machine translation post-editing is the means to correct some of the problems of machine translation as it stands now. Humans amend the machine-generated translation, fixing problems such as faulty grammar, incorrect context, and cultural insensitivity, to produce a more creative, more accurate final translation.
As machine translation becomes faster, cheaper, and more accurate through machine learning, machine translation post-editing is becoming a highly specialized field itself. Post-editors undergo rigorous training and practice in order to produce high-quality results. Post-editing keeps pace with innovations in machine learning by adjusting how much editing is done.
There are two levels, light post-editing and full post-editing, and the choice depends on the translation engine.
For highly accurate machine translation texts, light post-editing is optimal. For machine translation texts with many mistakes, full post-editing is used. Machine translation is also well suited to processing large amounts of data or documents because it works efficiently at scale. For massive amounts of data, light post-editing is preferred. Check with your professional translation service to find out which one is right for you.
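The rule of thumb above can be written down as a small decision function. This is only a sketch: the function name, the accuracy estimate, and both thresholds are hypothetical placeholders, and a real translation service would set them based on its own engine and quality requirements.

```python
def choose_post_editing(estimated_accuracy, word_count,
                        accuracy_threshold=0.9, bulk_word_count=1_000_000):
    """Pick a post-editing level from the rule of thumb in the text.

    estimated_accuracy: assumed 0.0-1.0 quality estimate for the machine output.
    word_count: size of the job in words.
    Both thresholds are illustrative defaults, not industry standards.
    """
    if word_count >= bulk_word_count:
        return "light"  # massive volumes: light post-editing is preferred
    if estimated_accuracy >= accuracy_threshold:
        return "light"  # highly accurate output: light post-editing is optimal
    return "full"       # many mistakes: full post-editing

print(choose_post_editing(0.95, 5_000))      # light
print(choose_post_editing(0.60, 5_000))      # full
print(choose_post_editing(0.60, 2_000_000))  # light
```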
Machine Translation in Web Development
When creating web products, many entrepreneurs want to make them multilingual. This makes the web service more customized and accessible to a wider audience. While the basic version of the web product will often be in English, you can get more engagement from users by providing them with the service in their own language.
One of the best-known ways to do this is to add Google's machine translation API. It's no surprise that Google's machine learning product is the most popular option. The company has been working on machine learning and machine translation since the mid-2000s and is one of the pioneers of the field. More specifically, many websites use Google's Translate API. Since 2006, the system has learned from hundreds of thousands of documents written in different European languages. Today, it supports over 100 languages.
We won't go into the technical details of the installation here, but once the Translate API is integrated for one language pair, the same integration can be reused for other languages: you send the source text along with a target language code, and the service returns the translation. So, you can set up your website in English, translate it into French via the API, and then repeat the same request for, say, Chinese and German. All this greatly speeds up the process of creating multilingual web products. Finally, it's worth mentioning the price: the Translate API costs $20 per million characters.
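As a rough sketch of what a multilingual setup involves, the Python below loops over target languages and estimates the bill at the $20 per million characters quoted above. It does not call Google's service: `fake_translate` is a made-up stand-in for a real API client, and the function names are hypothetical. The cost estimate also assumes that every character is billed once per target language.

```python
PRICE_PER_MILLION_CHARS_USD = 20.0  # price quoted in the article

def estimate_cost_usd(texts, target_languages):
    """Estimated cost, assuming each character is billed once per target language."""
    total_chars = sum(len(text) for text in texts)
    return total_chars * len(target_languages) * PRICE_PER_MILLION_CHARS_USD / 1_000_000

def localize(texts, target_languages, translate):
    """Translate every text into every target language, one request per text and language."""
    return {lang: [translate(text, lang) for text in texts] for lang in target_languages}

def fake_translate(text, target_language):
    # Placeholder for a real translation API call; it only tags the text.
    return f"[{target_language}] {text}"

pages = ["Welcome to our store", "Contact us"]
languages = ["fr", "de", "zh"]

site = localize(pages, languages, fake_translate)
cost = estimate_cost_usd(pages, languages)
print(site["fr"])
print(f"Estimated cost: ${cost:.4f}")
```

Passing the translator in as a function keeps the sketch independent of any particular client library, so the stand-in can be replaced with a real API call later.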
How Are Tech Companies Incorporating Machine Translation?
Machine learning is being used to train machines to automate work and make it easier for humans. You can use machine learning to keep your business safe, or for time series forecasting, to give two specific examples. In machine translation, as we’re learning, it’s being used to automate translation. Tech companies are slowly incorporating machine learning and machine translation into their methods. It’s being used everywhere: in government, healthcare, finance, eCommerce, and legal, and especially in software and technology.
You already know about Google Translate. Try it, and it’s accurate for the most part. It’s come a long way since its unveiling in April 2006, thanks to machine learning. It now has 500 million daily users worldwide and offers 103 languages, according to The Independent. As machine learning continues to improve the Google translation engine, it will only get better.
Facebook, for example, introduced M2M-100, a multilingual machine translation model that can translate between any pair of 100 languages without relying on English data, which Facebook describes as the first of its kind. It’s open source. Before, translation engines relied on English data. That meant a French-to-Polish translation passed through English in between. But due to machine learning, machine translation is getting much, much smarter. Who knows what the future has in store?
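The difference between translating through English and translating directly can be sketched in a few lines of Python. The dictionaries below are a toy stand-in for real translation engines, and all names are hypothetical. The two counting functions only do the arithmetic on language pairs; they are not a claim about how any particular model is built.

```python
# Toy "engines": one word-level dictionary per translation direction.
TOY_ENGINES = {
    ("fr", "en"): {"chat": "cat"},
    ("en", "pl"): {"cat": "kot"},
}
calls = []  # records which directions were used

def toy_translate(text, source, target):
    calls.append((source, target))
    table = TOY_ENGINES[(source, target)]
    return " ".join(table.get(word, word) for word in text.split())

def pivot_translate(text, source, target, translate, pivot="en"):
    """English-centric translation: source -> pivot -> target."""
    if source == pivot or target == pivot:
        return translate(text, source, target)
    intermediate = translate(text, source, pivot)
    return translate(intermediate, pivot, target)

def direct_directions(n_languages):
    # Every ordered pair of distinct languages.
    return n_languages * (n_languages - 1)

def english_centric_directions(n_languages):
    # Into English and out of English for every other language.
    return 2 * (n_languages - 1)

print(pivot_translate("chat", "fr", "pl", toy_translate))  # kot
print(calls)                             # [('fr', 'en'), ('en', 'pl')]
print(direct_directions(100))            # 9900
print(english_centric_directions(100))   # 198
```

With pivoting, any mistake made in the English step carries over into the Polish output, which is one reason direct translation between language pairs is attractive.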
Ofer Tirosh is the CEO of Tomedes professional translation services, which offers machine translation post-editing for over 120 languages and 950+ language pairs.