Can we deliver translations that are more than accurate?
Delivering an accurate translation is the core mission of a language translator. Any professional translator should achieve this, and any failure to do so is tantamount to professional negligence. Accurate translation is also the gold standard against which automated translation is assessed. However, should this not be considered the bare minimum? If so, what is “more than accurate”?
To answer those questions, we must first define what we mean by accurate translation. To translate a text accurately, we must conserve the semantics of the source document. Firstly, we must convey the meaning of the words or expressions, within the context of sentences, paragraphs, and the entire text. In addition to choosing the right words, this includes respecting the correct spelling in the target language. Secondly, we must follow the rules of grammar and punctuation scrupulously. Following these two principles will provide an adequate translation that is useful in most contexts, a result sometimes achieved on simple non-technical texts by AI-based machine translation such as Google Translate or DeepL.
Is that sufficient? Can you expect more from a professional translator? Of course, you can. And you must!
An excellent translation is more than accurate. On top of conveying the meaning of the source, it should deliver the message as intended by its authors.
To do so, the translator must sometimes make decisions regarding the level of technicality to adopt. These choices are particularly important in the biomedical domain, where the granularity of concepts and their relationships differ between languages (although the translator will face them in most technical domains). For instance, there is not always a one-to-one mapping between the English and French descriptions of anatomical parts or symptoms. French doctors also tend to use more technical terms when talking to patients than British doctors. Therefore, to conserve the same impact, a given source document will have to be translated slightly differently if the intended audience is, e.g., a surgeon who is supposed to reproduce a procedure, a physician who needs to understand a condition, patients looking for information underpinning therapeutic decisions, or the general public. “Disease burden” should be translated into “charge de morbidité” in an epidemiological document, but probably into “impact de la maladie” in a marketing presentation.
Such technical choices rely on past expertise, which is why translators have specialities and why they improve with time, like good wine. But they also emerge from dedicated research, conducted for each translation project. A good example is the translation of safety data sheets (the document describing the characteristics, possible health effects and precautions to be taken with a chemical compound or a drug). Both the headings and the contents are codified and country-specific. Knowledge of both languages will be sufficient to communicate the meaning of the text, but the result of the translation will not be a valid document. To produce one, the translator must read the specifications of such safety data sheets in both the source and target languages. This is one of the areas where human translation cannot yet, and probably will not for a while, be replaced by machine translation.
The meaning of words, the semantics, is not the only factor to take into account when polishing a translation, though. The tone of the text and the specific idiom to use (whether an actual dialect or the jargon of a specialist circle) will also strongly affect the delivery of a message. Depending on the type of document, the length of sentences, the rhythm, and the punctuation might need tuning to reach the target population. The aesthetic of a text, its general catchiness, is a cornerstone of marketing. And this holds whether one translates brochures, websites, or… research publications and grant applications!
Finally, the cherry on the cake, which perhaps differentiates a specialist linguist from a mere translator, is the correction of the source document. This must be done tactfully, and perhaps only after translator and client have established some level of trust. Such corrections might be of a proofreading nature (fixing typos) or more profound, including factual corrections or advice on delivery.
All this will contribute to a more than accurate translation. And all this is, currently and for the foreseeable future, out of reach of the most advanced Machine Translation approaches.
In scientific texts, less is often more. Fewer figures and tables mean more clarity; fewer experiments and results mean more impact. This might seem counter-intuitive since more information should always be better, right? Moreover, whether as preliminary data in a grant application or as results in a research paper, we all want to describe all the great experiments we ran, the clever analyses we came up with, and the conclusions we derived. However, we also want – and need – to convey excellence.
Except for truly groundbreaking research papers, where a single result matters, overshadowing everything else, the final impact of a paper or a grant application on the reader will reflect the average quality of every independent result. If you performed three or four excellent experiments, and they are sufficient to demonstrate your point, every additional result of less novelty or perfection will decrease the average final impact. Here are five points you should reflect on when writing a scientific text.
1) Put yourself in your reader’s shoes
When producing any kind of material for public consumption, whether text or other types of documents, in science or any other field, we should never forget that the content should be geared toward the audience, not ourselves. As such, when writing an article or a grant application, we should always keep the potential reader in mind.
What is interesting for you is not necessarily interesting for them
You do not want to bore them stiff
You do not want to make important facts or conclusions hard to find
You want them to remember the main message
You want them to remember the WOW feeling they had when reading
You do not want them to think meh at any time, about any result
2) Do not describe the entire journey that led to the final set of results
Imagine you ordered a wedding cake. The baker experienced three mishaps before getting it right. The first mix was wrong and did not bake properly; the second cake was overcooked; the design of the third one failed. Do you expect the baker to deliver all four attempts on the wedding day or only the fourth one?
Each scientific text tells a story and unfolds along a storyline. This is necessary to bring the reader to the conclusions we want to share. However, this storyline is a logical construct, built to make the point clearer and easier to understand. It does not need to be the actual story, as it happened in the lab. There is no need to describe all the false starts, the dead ends, the mishaps, all the iterations with optimization (see below). Just tell the readers what you found, using the final or best experiments you performed.
3) Do not clutter the main body with negative results
Negative results are important, and we should not hide them. However, there is no need to put them in the main body of a text, unless they reveal new insights. The overall message of a paper or grant application should always be positive, optimistic and forward-looking. If you want to report negative results, or warn others of dead ends to save them time, energy, and expense, why not put these results in supplementary materials, on your website, or deposit them in the relevant public database?
4) Do not clutter the text with sub-par or trivial results
There is no need to explore a question exhaustively in the main text of a research paper. You should choose your main message and build up the best case for it, using only what is necessary to demonstrate your point. Yes, I am sure there are many other interesting aspects worth presenting and discussing in your experiments or your datasets. However, by devoting too much space to those secondary questions, you will dilute the primary conclusion, making it harder to identify and less impactful.
5) Do not load the main text with set-up procedures
You should mention the tests you performed and the validation procedures you put in place. This is important. But is it crucial to show all of them in the main body of your text? You probably ran dozens of experiments to find the right dose for your drug or marker, to optimize your buffers or culture medium. This took lots of effort and time. But is it as important for the reader as the final dose or culture medium? After all, we assume that, like all professionals, you performed due diligence. Would you list all the search expressions you used in PubMed to perform the necessary bibliographic search during the project?
Remember, the people you want to impress most are editors and reviewers (for a paper) and members of grant panels (for an application). These people are often senior scientists, which means they have a limited amount of time available for each text. Furthermore, they are not always technical experts on the very question tackled in each of these texts (whether this is a pathological or desirable feature of modern research assessment is beyond the scope of this post). Most of them will only read the main body, and probably quickly. If you want to provide background information, put it in supplementary materials or on a website.
In conclusion, in order to maximize impact, build a story as long as necessary and as short as possible. Remember: The perceived excellence of a piece of work is the average of the excellence of each of its components.
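As a rough illustration of this averaging argument, the sketch below treats perceived impact as a plain mean of per-result quality scores. The scores, the 0 to 10 scale, and the function name `perceived_impact` are all made up for the example; nothing here is measured data.

```python
from statistics import mean


def perceived_impact(scores):
    """Toy model: perceived impact is the plain average of per-result scores."""
    return mean(scores)


# Hypothetical quality scores on an arbitrary 0-10 scale (illustrative only).
core_results = [9, 9, 8, 9]   # the few excellent experiments
extra_results = [6, 5]        # weaker results one is tempted to add

core_only = perceived_impact(core_results)
with_extras = perceived_impact(core_results + extra_results)

print(f"core results only:  {core_only:.2f}")
print(f"with weaker extras: {with_extras:.2f}")
```

Under this toy model, adding the two weaker results lowers the average even though the total amount of information went up.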
Machine Translation (MT) is one of the most discussed topics in the world of translators at the moment (on par with collapsing fees). Most of the arguments revolve around either its usefulness or the threat it poses to professional human translators. We briefly touched on it in a previous post, but we would like to go a bit deeper here and provide some ideas about making the most of MT within the current translation workflow.
What is MT?
Wikipedia tells us that machine translation is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another (warning, this Wikipedia page is quite outdated, as evidenced by the tiny mention of the neural network-based approach). Within the world of translation, this means the automatic translation of a piece of text by software that analyzes the source, without human intervention. This is different from (and complementary to) systems based on translation memories.
This post is not a technical essay on the inner workings of MT, and we are not going to explain how the translation is actually done. Many approaches were proposed over the years, with increasing success. However, the paradigm shift happened – as in many other domains – when people started to use “deep learning”, i.e. cascades of artificial neural networks trained on a huge amount of data (for more technical information you can read Google’s Neural Machine Translation (NMT) paper on arXiv). Suddenly, one could actually copy-paste an e-mail or a webpage into a translation tool and understand what it was about. Sure, the result is not perfect. Let’s be frank, it is often quite bad and sometimes funny. But it is understandable, and more or less looks like what a human with an intermediate level in a foreign language could produce when translating a text about a topic they know nothing about. And the spelling and grammar are better than in many of the e-mails, text messages and Facebook posts we are all subjected to daily. The latest massive improvement came with the DeepL system, which trained its network on the Linguee database of existing translations.
How does the professional translation world work?
In order to understand the disruption brought by MT, it is useful to recapitulate how a large part of the professional translation world is organized. There are exceptions to what we describe below, fields of translation where people interact differently, such as companies with embedded translation offices, authors dealing directly with their translators, etc. We are not concerned with these here, although MT presumably has a large impact there as well. First of all, there are three different jobs involved in the production of a translated document: 1) Translation per se; 2) Editing (for which the source document is needed), where one checks that the translation is accurate and that all the requirements are followed (e.g. no translation of person and product names); and 3) Proofreading (for which the source document is not needed), where one checks spelling, grammar, punctuation, etc. This is the so-called TEP workflow.
Typically, when someone, the end client, is in need of a translation, they will either contact a translation company or will post a job advert on one of the many possible websites, either non-specialised – such as Upwork or Freelancer.com – or specialized in translation – such as TranslatorsBase or TranslatorCafé. The companies can be real translation companies, performing in-house translation, or agencies, outsourcing the work. In most cases though, some outsourcing will be involved since very few companies have enough employees to cover all language pairs and expertise in all fields. Such outsourcing will be done through the company’s own network of freelancers, via professional platforms such as ProZ or using the sites mentioned above. Now, sometimes, the outsourcing process does not stop here, and a cascade of subcontracting unfolds, with decreasing fees at each step of the ladder. Unfortunately, as the fees decrease, so does the quality of the translation. This is why a revision step is put in place by the outsourcers. This can be just a proofreading exercise, fixing spelling, punctuation and the occasional grammar issue. Or it can turn into a heavier editing task, correcting translation mistakes. In the case of an outsourcing cascade, this can effectively become a retranslation.
How is MT affecting the translation pipeline?
Before the advent of NMT, MT produced text so bad that it took a professional translator longer to fix it than to retranslate from scratch. A machine-translated text was also immediately obvious, even when compared to bad human translations. All that has now changed. The quality of the produced translations has increased dramatically (at least in certain cases; we discuss this in the next section) and large amounts of text can be translated very quickly. While the free online versions generally limit every single translation to a few thousand characters, one can extend that via APIs (with or without fees, see for instance the R package deeplr).
This triggered two consequences, one ethical, one unethical, but both unfortunate. The first consequence is that some agencies think they can stop outsourcing the human translation part of a job and only pay for the revision one. The second consequence is that some freelancers pretend to translate themselves while they just use MT and a superficial revision. To be honest, in the latter case we are generally at the bottom of the subcontracting cascade, and the human translation would be quite bad anyway. In both cases, the result is a text that requires editing rather than proofreading. In the first case, agencies are honest and openly admit the fact, offering jobs of Machine Translation Post Editing (MTPE). But, and this is the crux of the problem, in both cases, the rate offered is at the level of proofreading rather than editing.
Improved MT also brought another change to the working practices of a professional translator. Many translators use Computer Aided Translation (CAT) tools. Typically, such a tool divides the source text into segments that are translated separately. Those tools now give access to MT engines that suggest translations for each segment, as an alternative to Translation Memories (TMs), even if one could argue that DeepL is somehow linked to an uber-TM, in the form of the Linguee database.
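To make the notion of a segment concrete, here is a minimal sketch that splits a text into sentence-like segments. It is an assumption about what a very naive segmenter could look like, not a description of how any particular CAT tool works; the function name `split_into_segments` is hypothetical, and real tools use richer, language-specific rules (abbreviations, numbers, and so on).

```python
import re


def split_into_segments(text):
    """Naively split text into sentence-like segments.

    A segment ends at '.', '!' or '?' followed by whitespace.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop the empty string produced when the input is empty.
    return [part for part in parts if part]


source = "The sky is grey. It is likely to rain! Is Postman Pat's truck red?"
for number, segment in enumerate(split_into_segments(source), start=1):
    print(number, segment)
```

Each segment can then be translated, checked and stored on its own, which is what makes per-segment suggestions from a TM or an MT engine possible.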
Understandably, the world of professional translation has been shaken by the sudden rise of NMT. In a couple of years, what was seen as a promising field of research became a game-changer. The reaction in such situations is always the same. It broadly follows the Five Stages of Grief. Because of the history of the field, most translators went through the denial period. Many are still stuck there. Using the cases where MT performs badly – albeit not worse than a casual translator not doing their homework – as evidence, such people reject its relevance entirely. A portion of the community moved on to the bargaining phase (trying to avoid or compete with MT), and some are even in the acceptance phase. However, a very vocal part of the community is currently in the anger phase. In some sense, they are similar to the Luddites who refused industrialization for fear that it would suppress their jobs. However, since they cannot break the MT engines, they turn their anger towards the translators using them. They are mistaken in exactly the same way as the 19th-century Luddites. They fear that the change of paradigm will remove the need for skilled workers and replace them with cheap unskilled ones. Exactly the opposite will happen, as it did back then, when automation created highly skilled jobs and removed the poorly paid manual ones. The segment of the translation community that will be the most affected by MT is non-technical, low-quality translation, while the skills of specialised human translators will be better recognized than they were when lost in an ocean of mediocre translators. Which brings us to the strengths and weaknesses of MT.
How good is MT?
So, machine translation has improved tremendously, but how good is it for practical purposes? Sure, we have all come across funny translations, and we can all do with a good laugh. However, for simple texts, the result is OK. DeepL’s translations into French of the following sentences are almost perfect: “The sky is grey. It is likely to rain”, “Postman Pat’s truck is red”, “The Luddites were a secret oath-based organization”, “Jeremy Corbyn is the leader of the Labour Party”. In the case of the first example, DeepL actually chooses a correct but suboptimal translation (Il est probable qu’il pleuve). However, it picks the right one (Il va probablement pleuvoir) if we add a double quote at the end, which reveals one of the problems specific to its approach, namely oversensitivity to local context in existing translations. That said, Google Translate always picks the suboptimal solution.
This suggests a range of situations where MT could be used: everyday discourse, children’s stories, factual descriptions, and news. What do those situations have in common? The language is simple, and must be understood by everyone. These are “layperson translations”.
Now, by contrast, MT fails with highly specialised and technical documents, when the language requires a pre-existing particular knowledge from the reader, not shared by the entire population. Why is that? Because MT cannot cope with several situations, including the following:
When a word has several widely different meanings, and the source text does not use the most frequent one. For instance, in the ecclesiastic world, the French word “coule” designates a garment worn by monks. Now, MT will always believe “coule” is a verb meaning either that some liquid is flowing downward or that something is being submerged by water. The proposed translations will be flow, run, pour, sink, cast (if what is flowing is metal or cement), stream, trickle, or even founder. It will never be cowl.
Idioms that use different words or expressions in different languages. Here we find the famous “il pleut comme vache qui pisse”, which translates into “it rains cats and dogs”. Same underlying meaning, totally different expression. In general, all such figurative expressions tend to be translated literally by MT, resulting in completely meaningless sentences.
Meronymy/Holonymy, that is when the word used in a language represents part of the thing which the equivalent word in another language represents. I am not talking about synecdoche here, which is a figure of speech that uses the part for the whole or the other way around.
Hyponymy and hypernymy, that is when a word in a language represents a generalization of the thing represented by the word in the other language. For instance, “seagull” is a layperson English word representing a subset of the family Laridae. In French, there is no such layperson term. Instead one will use either “goéland”, representing the genus Larus, which are big birds, or “mouette”, representing several genera of the subfamily Larinae, which are small birds. MT has no way of knowing which one the author of the source text meant (even if the previous sentence clarified the issue).
Complex relationships. In English, the temporal bone of the skull is separated into parts coming from different embryological origins (the squamous, petrous and tympanic bones). In French, the temporal bone is separated into regions of the adult structure, the “écaille”, “rocher”, and “mastoid”. It is impossible to translate one into the other. One has to reconstruct the entire description.
Context-dependent translation. MT typically focuses on a word and its immediate surroundings. For instance, a human translator will understand that in the following sentences, “La fille regarda les jouets qu’on lui avait offert. Son ballon était bleu et son vélo rouge”, the ball and the bike belong to the girl. But MT cannot determine that. Both GT and DeepL translate it into: “The girl looked at the toys that had been given to her. His balloon was blue and his bike red.” (which by the way is a great example of unintended but real sexism).
I am certain there are other areas where MT performs unevenly or badly (for instance when it comes to household names, slang, etc.)
Among the other issues presented by MT are two problems that mirror each other. Since the MT engines have no memory of the entire text, the same word can be translated differently in different parts. Sometimes it does not matter, as with “stream” and “trickle” in the example above. Sometimes it does, if we get “stream” in one place and “cowl” in another! Conversely, because MT engines were built on a given training set, they tend to produce texts that are boring in terms of vocabulary and “robotic” in terms of style. To be fair, this is much less of a problem with DeepL than with GT. Also, the problem is worse with translation memories, so MT might even be an improvement here.
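The first of these two problems, inconsistent rendering of the same term, can be partly caught with a simple check over translated segments. The sketch below is only an illustration under stated assumptions: the segment pairs are invented for the example (they are not real MT output), the list of candidate translations has to be supplied by hand, and the helper names `words` and `find_renderings` are hypothetical.

```python
import re


def words(text):
    """Lower-case a text and return its words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def find_renderings(segment_pairs, source_term, candidate_targets):
    """Map each candidate translation to the segments where it was used.

    segment_pairs: list of (source_segment, target_segment) strings.
    candidate_targets: possible translations of source_term to look for.
    """
    used = {}
    for index, (source, target) in enumerate(segment_pairs):
        if source_term.lower() not in words(source):
            continue  # the term does not occur in this segment
        target_words = words(target)
        for candidate in candidate_targets:
            if candidate.lower() in target_words:
                used.setdefault(candidate, []).append(index)
    return used


# Invented segment pairs, for illustration only.
pairs = [
    ("Le moine portait une coule.", "The monk wore a cowl."),
    ("La coule était noire.", "The stream was black."),
    ("Il rangea la coule.", "He put away the cowl."),
]

renderings = find_renderings(pairs, "coule", ["cowl", "stream", "trickle"])
if len(renderings) > 1:
    print("Inconsistent renderings:", renderings)
```

A check like this only flags candidates for review; deciding which rendering is right still requires a translator who understands the text.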
Two ways of using MT in professional translation
At the heart of the debate and disagreement around MT in the professional translation setting lies a lack of clarity on the way it is and/or should be used. At the moment, there are two very different ways of using MT for translation: 1) using MT to perform the whole translation, and asking third parties to review the results; 2) using MT as one piece of the toolkit to perform translations, for instance to provide starting points or alternatives for segments, in parallel to translation memories.
Many agencies, or publishers, think MT is ready for 1), while it is not. Let’s be really clear here: MT is not the key to automatically – and cheaply – translating corpora of texts, whether articles, books, etc.
Furthermore, reviewing translations performed that way is extremely difficult. It is by no means a proofreading exercise, but rather an editing exercise. We had to edit large texts which comprised parts translated by MT and parts translated by a human who clearly was not a native speaker of the target language. Both types were difficult to edit. However, there was one crucial difference: while the human-translated parts presented a horrendous style and many grammatical mistakes, the MT parts presented WRONG translations. In most cases, this is much worse. For instance, in the biomedical domain, a tiny misunderstanding might lead to dreadful consequences.
Conversely, many professional translators think or claim that MT is not ready for 2), and cut themselves off from a very useful tool. We wholeheartedly adopted 2). We think much improvement is needed, and that it is possible (see below). We believe translators, like any professionals, need to take control of their tools. When farmers work their fields, they use various technologies. But one rarely sees third parties, completely unaware of what was done to the ground and how it was done, coming and evaluating the work. They just buy the product. We think MT should be used by translators, not blindly, but in a controlled manner. Then, we will be able to learn from it, but also to help it grow to become an even more useful tool.
How to use MT efficiently
Use MT on a segment-by-segment basis rather than for the whole text (the definition of what makes a segment is left to the imagination or the preferences of the reader/translator).
Never accept a proposed translation blindly. Check all the important words, as well as tenses and agreements.
Make full use of the alternatives provided for instance by DeepL. The proposed choice is statistical, but often the right or more accurate one is within the first 3-5 alternatives.
Once a significant chunk of text is translated, re-read it in its entirety to make the style more homogeneous and reduce repetitions. To be fair, this is not specific to MT, and should always be done.
Back-translate the text from the target to the source language, in order to spot possible ambiguities or mistranslations.
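The back-translation step in the list above can be sketched as a round-trip check. Everything in this example is an assumption made for illustration: the two "engines" are plain lookup tables standing in for whatever MT system is used, the back-translated sentence is invented, the function name `round_trip_check` is hypothetical, and the 0.9 threshold is arbitrary. The similarity score comes from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def round_trip_check(source, translate_forward, translate_back):
    """Translate, back-translate, and score similarity to the source (0 to 1)."""
    target = translate_forward(source)
    back = translate_back(target)
    score = SequenceMatcher(None, source.lower(), back.lower()).ratio()
    return target, back, score


# Stand-in engines: lookup tables, so the sketch runs without any MT service.
# The back-translation is invented for the example.
forward_table = {"He ate his date.": "Il a mangé son rendez-vous."}
backward_table = {"Il a mangé son rendez-vous.": "He ate his appointment."}

target, back, score = round_trip_check(
    "He ate his date.",
    forward_table.__getitem__,
    backward_table.__getitem__,
)

print(target)
print(back)
print(f"similarity: {score:.2f}")
if score < 0.9:  # arbitrary threshold, to be tuned
    print("Review this segment by hand.")
```

A high score does not prove the translation is right (an error can survive the round trip), so this is a way to prioritise segments for human review, not a replacement for it.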
What do you think? Are you using Machine Translation at the moment? Which systems? How?
Many people start in the translation business without corresponding professional training. This is absolutely fine, and it is in fact a good way of using language skills acquired either during a professional activity or a travelling life. However, as amateurs, they tend to make the same mistakes. Here we list a few of them.
1) Believing that a translation job is just … translating
A translation job is much more than converting a text from a source language to a target language. Glossaries and a bit of grammar polishing would almost be sufficient for that. However, a translator must convey the “content” of the source document. That of course involves translating the words. But it also, and foremost, involves producing a text that carries the same message. Doing so requires understanding what the text is about, in detail and with all its subtleties. This is why all translators have their specialities, and although most translators can do an OK job with any text in their paired languages, they really excel only within a few niches.
Conveying the proper meaning is sometimes at odds with keeping to a strict translation of the words themselves. Depending on the domain covered, one wants to massage the text to make it more readable while respecting the form of the source text. With the exception of legal documents – where one must absolutely stick to the original, even if the result seems quite heavy – some sentence restructuring and expression switching is needed to make the result more palatable, and also truly equivalent in the target language. Finally, in the artistic domain, one wants to respect the style of the original, terse or verbose, dull or vivid, mainstream or abstruse. Lovecraft did not write like Stephen King despite hovering in the same literary space.
2) Starting the translation immediately
In order to translate a text accurately, we cannot start the work straight away. We must read the entire text beforehand, to make sure we understand what it is about, get an idea of the specialized knowledge we might need to acquire, and grasp what the authors' goal was. Such a preliminary read will only marginally increase the time spent on a text. Or at least it should, otherwise we are probably not spending enough time on the job! Reading a 100,000-word book before starting the translation might seem daunting, but the required time is still far less than what we will spend accurately translating those 100,000 words. And the gain down the line in terms of translation speed and accuracy largely makes up for the extra effort. During this initial read, we should make a note of anything we do not immediately get, any word or expression we have not come across in the past, and make sure we fully understand it.
3) Trusting machine translation
Machine translation has seen astounding progress in the past few years. Software such as Google Neural Machine Translation and (even more) DeepL has really transformed the activity, to the point that, in many cases, the result not only sounds like it has been produced by a native speaker, but is also better than a translation made by a casual translator, i.e. someone who would make most of the errors listed here … (By the way, this makes the ridiculous translations used in some places such as Stansted airport even more pathetic. It beggars belief that nowadays people produce voice announcements that barely make sense, and even check-in machines that speak nonsense languages using random words assembled in sentences with no grammar whatsoever).
However, machine translation is still mostly good for straightforward texts, without nuance, technical jargon, or stylistic oddities. It still relies too much on word-for-word translation, or translation of short segments. This often results in wrong choices for homonyms in the source language, wrong splitting of clauses in long sentences, lots of repetitions etc. Also, machines seem to ignore basic facts of life, such as only females giving birth. So the translation of “They gave birth to their babies” is invariably “Ils ont donné naissance à leurs bébés” and not “Elles ont donné naissance à leurs bébés”. More disturbingly, when we want to translate “he ate his date”, instead of “il a mangé sa datte”, Google Translate provides “Il a mangé son rendez-vous” and DeepL even decides to add slang with the delightful “Il a mangé son rencard”. Not very vegan.
That said, machine translation is generally a good feeder for Computer Assisted Translation, which brings us to the next mistake.
4) Blindly trusting the segment-based text proposed by our CAT software
Computer Assisted Translation speeds up translation massively. It saves all the time spent translating and typing trivial pieces of text such as “the red car”, “his name was Joe” and “the sky was gray and it was likely to rain”. However, CAT cannot be trusted blindly. CAT translation is based on segmentation. The text is split into small parts, each containing one or a few sentences. The software then suggests translations for each segment.
Firstly, some of those translations might come from machine translation, e.g. Google Translate or DeepL. Thus, see point 3. But very often the translations come from Translation Memories (TMs). Translation memories come with their own problems. Sometimes the translations proposed are plainly wrong, with missing words or wrong sentence parsing (resulting in wrong associations for adjectives or verbs, for instance). Another important issue is error propagation. If a segment was badly translated once, and this translation was recorded in TMs, it will be proposed in future translations.
A very important issue is that the translation proposed for a segment is based purely on that segment, independently of the content of the other segments of the text. There is rarely enough context in a single segment to discriminate between different meanings of a term.
Finally, the segmentation largely follows the punctuation in the source language. Depending on the translation, for instance in literary works where one needs to keep a style and rhythm, the optimal split might be different in the target language. Fortunately, CAT tools offer segment split/merge facilities.
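To make the mechanics concrete, here is a minimal sketch of punctuation-based segmentation followed by a fuzzy translation-memory lookup. It is only an illustration under our own assumptions: the function names `segment` and `suggest`, the toy memory `TM` and the 0.8 threshold are all hypothetical, and the similarity measure (Python's standard `difflib`) is a stand-in, not what any particular CAT tool uses. It shows how a bad entry recorded once is proposed again for a similar segment, without any look at the neighbouring segments.

```python
import re
from difflib import SequenceMatcher


def segment(text):
    """Naive segmentation: split after sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


# Hypothetical translation memory: source segment -> stored translation.
# The first entry is deliberately wrong ("arc" is the weapon, not the ribbon).
TM = {
    "She put a bow in her daughter's hair.":
        "Elle a mis un arc dans les cheveux de sa fille.",
    "The sky was gray.": "Le ciel était gris.",
}


def suggest(segment_text, memory, threshold=0.8):
    """Return (translation, score) for the closest stored segment.

    The lookup only sees the segment itself, never its context.
    The translation is None when no entry reaches the threshold.
    """
    best_src, best_score = None, 0.0
    for src in memory:
        score = SequenceMatcher(None, segment_text.lower(), src.lower()).ratio()
        if score > best_score:
            best_src, best_score = src, score
    if best_src is not None and best_score >= threshold:
        return memory[best_src], best_score
    return None, best_score


if __name__ == "__main__":
    text = "The sky was gray. She put a bow in her son's hair."
    for seg in segment(text):
        print(seg, "->", suggest(seg, TM))
```

The second segment is close enough to the stored one for the wrong “arc” translation to be offered again, which is exactly the error propagation described above. The translator, not the lookup, has to catch it.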
5) Assuming the source document is right
This is a thorny issue. The basic position is that the source language document is correct, and we need to faithfully translate it. But this is not necessarily the case. Everyone makes mistakes, even the most thorough writers. Some mistakes are easy to spot and to correct, and many should not affect the translation, such as unambiguous spelling errors. However, others will be much harder to detect. For instance, words with similar pronunciations in English (the ubiquitous “complimentary” for “complementary”, “add” for “had”, “your” for “you’re” or the dreadful “of” for “have”), or missing (or incorrect) accents in French, will lead to completely wrong translations. In many cases, the context will provide a quick answer, but sometimes a bit more brain juice is needed. We should always double check that we understood the text correctly, and that our chosen translation is the only one possible.
Finally, horror, some “errors” are made on purpose, for stylistic reasons. In the case of a novel or a play, wrong grammar or vocabulary might be part of the plot or a defining feature of a character. In that case, we probably must provide a translation that contains a correct equivalent of the initial erroneous text …
6) Forgetting to double check the punctuation
OK, that might actually be a specific version of the previous error. Translators are linguists, and like all linguists, we are in love with punctuation (aren’t we?). Is there anything that beats the Oxford comma as a favorite topic for conversation (except perhaps split infinitives)? Surprisingly enough, this is not the case for every person, or even every writer. Punctuation can be a life saver in the case of very long and complex sentences. It can also be a killer when it is absent, or, heaven forbid, wrongly placed. For instance, observe the following bit of text: “an off-flavour affecting negatively the positive fruity and floral wine aromas known as Brett character.”
What is the “Brett character”? (enlightened disciples of Bacchus, lower your hands). Is it the positive fruity and floral wine aromas? Or is it the off-flavour? It is, in fact, the latter, a metallic taste produced by some yeasts (from the genus Brettanomyces). Of course, the answer would be much clearer if the source sentence were:
“an off-flavour affecting negatively the positive fruity and floral wine aromas, known as “Brett character”.”
But let’s not add punctuation to Guillaume Apollinaire’s poetry, and keep Le Pont Mirabeau free of punctuation. Actually, the following translation of La Tour Eiffel might be one of the truest poetry translations ever, respecting the meaning, the style, and the shape.
7) Not paying attention to the mainstream use bias
This error is often a side-effect of using CAT tools with TMs or MT. The proposed translations will often rely on the most frequent meaning of a term, and its most frequent translation. This is not necessarily the meaning which is the right one, or the best one, for the current source document.
Sometimes, this is just irritating. For instance, in a literary text talking about “petits détours”, CAT will keep suggesting “small detours”. While this is correct, it does not fully convey the idea carried by “petits” here. It is too bland, too quantitative, and “little detours” is the best translation, as shown here, here and here.
However, the mistake can be more severe. Google Translate tells us the story of a dreadful mum, “She put a bow in her daughter’s hair” being translated into “Elle a mis un arc dans les cheveux de sa fille”. That must have hurt terribly. As was the case for the poor lad who “entered a ball” and ended up “entré dans un ballon” (GT) or even “entré dans une balle” (DeepL), instead of “entré dans un bal”. Not much room to dance there. Sometimes, the mainstream use is actually overridden by the politically correct one, and the saucy “he was nibbling at her tit” is translated into “il mordillait sa mésange”. Unless we are talking about a cat, that is a disturbing image rather than a titillating one. While those examples were a bit jokey, some cases are harder to spot. Someone who planted “Indian flags” in their garden will almost always end up, in French, exhibiting their nationalism rather than their love of irises.
In some cases, the various meanings have similar frequencies in daily use, and different tools provide alternative suggestions. DeepL will suit plumbers, providing “installer un compteur” for “To set up a counter”, while Google Translate will lean towards merchants with “mettre en place un comptoir”.
8) Trying to stick 100% to the words of the source text
The true meaning of a word goes beyond its definition in a dictionary. Words carry different weight in different languages. The rude word meaning faeces is used as an interjection in almost every language. However, the level of rudeness is different in all western European countries, and sometimes choosing another rude word of the adequate level is better (no, we will not provide examples). And of course, there are very few cases where anyone should translate “it rains cats and dogs” into “il pleut des chats et des chiens”. One should always translate it into “il pleut comme vache qui pisse” (it rains as if a cow were pissing). While the new image is not so much better, at least no animal is hurt.
9) Trying to stick 100% to the structure of the source text
Trying to reproduce the structure of the source document exactly is very tempting, and encouraged by the segmentation process of CAT tools. However, this is lazy. English sentences are known to be shorter than French ones. Therefore, translating a sentence from the latter language might require several in the former. Let’s not speak of German, where an entire sentence might end up in a single word! As usual, first comes the meaning, then the rhythm, then the style. Not only does this require merging or splitting sentences, it might also require swapping clauses or sentences.
10) Not reading back the complete resulting translation
Last but not least, we should never forget to attentively re-read the entire translation. In the profession, proofreading is often mentioned as an activity disconnected from translation. But no translation work should be considered complete without a proofreading step! This is even more important if CAT software was used. Such tools are known to promote “sentence salads”, where drawing on the memory of many previous translations produces texts that are heterogeneous in style and vocabulary.
What about yourself? Which mistakes did you make when learning how to become an accurate and efficient translator?
Since an important part of our activity is to assist researchers by improving their documents, whether grant applications, research publications or project reports, this blog will come back to this topic on a regular basis. We will write in depth about every aspect of scientific writing in turn. In this initial post let’s go wide and list a few general rules to improve research papers (although most of those apply to grant applications too).
1) Find your message
Of course, any body of scientific research brings about multiple results, that in turn affect the way we understand several aspects of reality, or can lead to a few technical developments. However, in order to maximize the impact of your account, you must choose an angle. What is the main point you want people to remember? What would be the sentence accompanying a tweet linking to your paper?
2) Know your audience
Once you have settled on your message, you need to select the population you want to “sell it to”, on which you expect the maximum impact. By audience, we mean first and foremost the editor and the reviewers. Yes, your ultimate aim is to spread the news through your community. But the paper needs to be published first … Knowing who you are talking to will help you structure the paper, as much as the instructions to authors. What will you put in the main body of the paper and what will you demote to supplementary materials? What should you present in detail or, on the contrary, gloss over? How will you present the methods and the results? Experimental biologists tend to dislike equations. Biochemists do not care much about the illustrating experiment, but want quantitative tables. Molecular biologists love histograms, and they like to have one illustrating experiment such as a western blot or a microscopy field.
3) Build a storyline
Keeping in mind both the message you chose and the audience you want to sell it to, create a progressive demonstration that brings the reader to the same conclusions as yours, keeping them focused until the final bang. This is not necessarily how things really happened, in particular chronologically. We all know that scientific research is a complex process, that includes iterative explorations, validations and controls, explored dead ends etc. There is no need to list every path you explored, every experiment you did. A paper is not a laboratory notebook. However, as you progress in your narrative, each step needs to lead naturally to the next, while the controls you describe preclude wandering off-path.
4) Keep facts and ideas where they belong
Do not mix Introduction, Materials and Methods, Results and Discussion (in some article structures, the last two can be included in the same section, but the relevant bits are generally in different paragraphs). The Introduction should only present and discuss what was known before your work, and only what is needed to understand your work and its context. Similarly, the Materials and Methods section should not describe new techniques or materials. And finally, all your results should be in the Results section (if separate from the Discussion). As a rule, if you remove everything but the Results section, you should not affect the storyline described above. Our advice is to start by writing the Results, then add the necessary Materials and Methods as you go, write the Discussion, and finish with the Introduction. You cannot write an effective introduction if you do not know what you want to introduce. Moreover, writing the introduction is often used as a procrastination device …
5) Build modules
Use one paragraph for one idea, if possible linked to one experiment, and illustrated by one figure (whether the figure ends up in the main body or in supplementary materials does not matter). This is sometimes challenging when the idea or the experiment is complex. But even if, in the end, paragraphs are merged or split, adopting this approach is useful during the initial writing stage. This will help build the storyline and will enable easy restructuring later. Give a title to each module. Try drawing a flowchart representing your results, and annotate it with experiments and figures.
6) Do not assume knowledge but do not state the obvious
Moving from the structure to the style now. It is always difficult to decide what is common knowledge and therefore should not be made explicit, and what is specialized knowledge required to understand your story and accept your message. Do not rely on your own knowledge. Go back to point 2), and make a real effort to put yourself in the shoes of the intended readership. A useful guideline is to introduce factual knowledge, but not the technical knowledge that is common in your audience’s domain. For instance, if you write for biologists, you should not assume that people know the Kd between PKA and cAMP is 0.1 micromolar. However, you may assume that readers know increased affinity means lower Kd.
7) Do not repeat information
Building your text following 5) should prevent you from describing the same information twice. When you refer to a piece of information described before, you do not need to restate it. This goes for details of experiments as well. If you already stated that an experiment took place in a chemical reactor, there is no need to mention it all the time, as in “the solution in the chemical reactor”, “the temperature in the chemical reactor” etc.
8) Write clearly, concisely and elegantly
I strongly recommend reading, and keeping a copy at hand of, “Style: Lessons in Clarity and Grace”. There are many simple habits you can use to improve the readability of a text. Avoid passive forms. Keep verbs for actions and nouns for what performs or is affected by such actions. Try to stick to one clause per sentence, two at most (and well articulated). Do not add words needlessly. Instead of writing “the oxidation of the metal”, write “metal oxidation”. A shorter text is read faster and is easier to memorize.
9) Avoid casual writing
A scientific article is not a diary (or a blog post …). Your reader is neither your mate nor your student. Avoid talking to them directly, e.g. “you can do this” or indirectly as “let’s consider this first”. This is not a huge deal, but it might be irritating for some people. In general, adopt the style commonly used in the most respectable journals of the community you are targeting.
10) Chase grammatical errors and spelling mistakes
They are less important than the scientific facts, and hopefully they will disappear during the proof-reading stage. However, they give a hint of sloppiness, and they will annoy the reviewers. Even if those reviewers are not the type to focus on such things, unconscious biases can taint even the fairest person. So read your manuscript again and again, slowly and aloud. More importantly, ask someone else, if possible not a co-author, to read it as well.