Scientists, do not make assumptions about your audience!

This is a post I could have written thirty years ago. The tendency of scientists (or any specialists, really) to write texts assuming their audience shares their level of background knowledge has always been a curse. However, with the advent of open access and open data, the consequences have become costlier. Recently, in what is probably one of the worst communication exercises of the COVID-19 pandemic, the CDC published an online message ominously entitled:

“Lab Alert: Changes to CDC RT-PCR for SARS-CoV-2 Testing”

Of course, this text was meant for a particular audience, as specified on the web page:

“Audience: Individuals Performing COVID-19 Testing”

However, the text was accessible to everyone, including many people who could not properly understand it. What did this message say?

“After December 31, 2021, CDC will withdraw the request to the U.S. Food and Drug Administration (FDA) for Emergency Use Authorization (EUA) of the CDC 2019-Novel Coronavirus (2019-nCoV) Real-Time RT-PCR Diagnostic Panel, the assay first introduced in February 2020 for detection of SARS-CoV-2 only. CDC is providing this advance notice for clinical laboratories to have adequate time to select and implement one of the many FDA-authorized alternatives.”

This sent people already questioning the tests into overdrive. “We’ve always told you. PCR tests do not work. This entire pandemic is a lie. We’ve been termed conspiracy theorists, but we were right all this time.” The CDC message is currently circulating all over social networks as supposed proof of their point.

Of course, this is not at all what the CDC meant. The explanation comes in the subsequent paragraph.

“In preparation for this change, CDC recommends clinical laboratories and testing sites that have been using the CDC 2019-nCoV RT-PCR assay select and begin their transition to another FDA-authorized COVID-19 test. CDC encourages laboratories to consider adoption of a multiplexed method that can facilitate detection and differentiation of SARS-CoV-2 and influenza viruses. Such assays can facilitate continued testing for both influenza and SARS-CoV-2 and can save both time and resources as we head into influenza season.”

The CDC really means that rather than using separate tests to detect SARS-CoV-2 and influenza virus infections, the labs should use a single test that detects both simultaneously, hence the name “multiplex”.

I have to confess that it took me a couple of readings to properly understand what they meant. What did the CDC do wrong?

First, calling those messages “Lab Alert”. For any regular citizen raised on Stephen King’s The Stand and movies like Contagion, the words “Lab Alert” mean “Pay attention, this is an apocalypse-class message”. What about “New recommendation” or “Lab communication”?

Second, the CDC should not have assumed that everyone knew what the “CDC 2019-nCoV RT-PCR assay” was. Out there, people understood that the CDC was talking about all the RT-PCR assays meant to detect the presence of SARS-CoV-2, not just the specific test previously recommended by the CDC*.

Third, the authors should have clarified that “the many FDA-authorized alternatives” included other PCR tests, and the message was not meant to say that the CDC recommended ditching the RT-PCR tests altogether.

Finally, they should have clarified what a “multiplexed method” was. I received messages from people who believed a “multiplexed method” was an alternative to a PCR test, while it is just a PCR that detects several things simultaneously (in this example SARS-CoV-2 and flu viruses).

In conclusion, you can, and of course should, think about your intended audience. However, you should not neglect the unintended audiences. This is more important than you think and not restricted to general communications. Whether a research article or a grant application, whatever scientific piece you write will reach three audience types.

  • The first comprises the tiny circle sharing the same knowledge background, typically reviewers (if the editors do their job properly…). 
  • The second will be made up of the population at large, who will not understand a word, and frankly, are not interested in whatever you are babbling about.
  • The third is the dangerous one. It is made of people who have a certain scientific background, sufficient to broadly understand the context of your text, but who lack the advanced knowledge to precisely grasp your idea, its novelty, and its consequences. These people will read your text and believe they understood your points. The risk is that they did not. Misunderstanding your point might be worse than not understanding it.

It is always good to get your texts read by someone belonging to this third population before submitting them to journals or funding agencies.

*There is actually another very interesting story related to this topic when, at the beginning of the pandemic, many labs proposed to use their own PCR tests but could not because only the CDC-recommended test could be used, delaying the implementation of mass testing by many weeks.

Ages, vaccination and infections

By Nicolas Gambardella

How many times do we see the following comment on social media these days: “Most covid-19 cases are now in vaccinated people. This is the proof that vaccines don’t work.”

Not quite.

It all depends on the relative sizes of the vaccinated and unvaccinated populations. In a previous post, I presented a summary of vaccine effectiveness on different SARS-CoV-2 variants. Each figure represented the overall effectiveness. However, vaccination rates depend on age since most countries started to vaccinate the elderly first. So let’s see if we can be more precise.

Public Health England recently published its latest SARS-CoV-2 variants of concern and variants under investigation in England. It contains the details of infections by identified variants in vaccinated and unvaccinated people. Let’s focus on the Delta variant.

Whaaaat? In people over 50 years of age, only 976 cases were recorded in unvaccinated people, while 3953 people with one dose and 3546 fully vaccinated people were infected! Surely this vaccine does not offer any protection, right?

Not so fast. Let’s see if we can compute the vaccine effectiveness, shall we? For that, we first need the vaccination rate per age group. Fortunately, this is published by Public Health England every week. Since the table reports cases up to June 21, we will use the vaccination data published on June 24, which include vaccinations up to June 20. Of course, not all the Delta cases appeared on June 20. However, most of them appeared in the past few months. Moreover, administration of the 2nd dose has plateaued for the elderly population.

Then, we need to know how many people belong to each of those age groups. For that, we can use the 2020 population projected by the Office for National Statistics based on the 2018 figures (the age pyramid shows percentages for each year, but we can download actual numbers for each 5-year age group).

We can now compute for each age group how many people had two doses, only one dose, or are still unvaccinated (I sum up males and females).

Age       1 dose    2 doses   unvaccinated
0-17       56230      56584       14033150
18-24     605991     726010        4318151
25-29     837303     728416        2924493
30-34    1587417     847650        2100138
35-39    1786373    1003080        1628239
40-44    1702290    1268632        1127652
45-49    1583028    1838131         890384
50-54     548632    3378858         690501
55-59     396658    3567857         549185
60-64     202651    3272462         387171
65-69      95785    2998587         268383
70-74      59031    3118291         191577
75-79      41180    2259096         111907
80+        84457    3159787         164372
total    9587027   28223441       29385301

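Leaving aside those with a single dose, these numbers show 24118034 people over 50: 21754937 with two doses and 2363096 unvaccinated, nearly ten times as many fully vaccinated! Thus, the 3546 and 976 cases represent 0.0163% and 0.0413% of the respective populations. The infection rate among the fully vaccinated is therefore about 39.5% of the rate among the unvaccinated. In other words, full vaccination offers 60.5% protection against the Delta variant.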
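The arithmetic behind this figure can be sketched in a few lines of Python. This is only an illustration of the calculation described above, using the over-50 rows of the table and the case counts quoted in the text; the function and variable names are mine, and effectiveness is taken as one minus the ratio of the two infection rates.

```python
# Over-50 age bands copied from the table above:
# band -> (fully vaccinated, unvaccinated)
OVER_50 = {
    "50-54": (3378858, 690501),
    "55-59": (3567857, 549185),
    "60-64": (3272462, 387171),
    "65-69": (2998587, 268383),
    "70-74": (3118291, 191577),
    "75-79": (2259096, 111907),
    "80+": (3159787, 164372),
}

# Delta cases in the over-50s, as quoted in the text.
CASES_TWO_DOSES = 3546
CASES_UNVACCINATED = 976


def effectiveness(cases_vacc, pop_vacc, cases_unvacc, pop_unvacc):
    """Return 1 - (infection rate in vaccinated / infection rate in unvaccinated)."""
    rate_vacc = cases_vacc / pop_vacc
    rate_unvacc = cases_unvacc / pop_unvacc
    return 1 - rate_vacc / rate_unvacc


pop_two_doses = sum(two for two, _ in OVER_50.values())
pop_unvaccinated = sum(unvacc for _, unvacc in OVER_50.values())
ve = effectiveness(CASES_TWO_DOSES, pop_two_doses,
                   CASES_UNVACCINATED, pop_unvaccinated)

print(f"fully vaccinated: {pop_two_doses}, unvaccinated: {pop_unvaccinated}")
print(f"infection rates: {CASES_TWO_DOSES / pop_two_doses:.4%} "
      f"vs {CASES_UNVACCINATED / pop_unvaccinated:.4%}")
print(f"effectiveness: {ve:.1%}")
```

The population sums agree with the totals quoted in the text to within rounding, and the effectiveness comes out at 60.5%.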

The same calculation on the under-50s shows even better protection, at 70.8%. (This, again, shows that we must vaccinate young people if we want to protect the older ones and get rid of this virus.)

The better the vaccine coverage, the more cases will be observed in the vaccinated population. This does not mean the vaccine is not effective!

On vaccines and variants

By Nicolas Gambardella

Since the development of the first vaccines against SARS-CoV-2, I have gathered data about their efficacy. Unfortunately, this efficacy is continuously challenged by the appearance of variant viruses, i.e., novel strains carrying a characteristic set of mutations. With so many vaccines and so many variants, it becomes difficult to keep track of the data. This is compounded by the abundance of publications presenting different types of evaluations. So, while keeping track of all the values and their confidence intervals is very important, I thought it would be nice to have a single, simplified overview of where we stand.

The figure below represents the overall efficacy of the main vaccines for the main variants as visual percentages. The blue dots represent protected people who would have been infected without the vaccines. Grey dots represent {vaccine, variant} pairs for which not enough data are available. This figure represents the protection from infection, not the protection from disease or death (which is likely higher). The figures are those achieved after the recommended dosing protocol for each vaccine.

These numbers are the best and most reliable estimates as I write this post (updated 26th July 2021). I favoured real-world data over clinical trials, directly measured efficacy over efficacy inferred from neutralisation assays (where the serum of vaccinated individuals is used in vitro with viruses or recombinant proteins), and independent data over data provided by vaccine manufacturers. I omitted some authorised vaccines because of data scarcity (and low usage). Some of the data used to plot the graph are known to have peculiarities and have drawn criticism. However, nothing better is available. Hopefully, these plots will become more accurate as more studies are published.

How should “evidence-based medicine” be translated into French?

By Nicolas Gambardella

Today, let us tackle a topical question, and one close to my heart: what is called in English “evidence-based medicine”. How should this expression be translated into French?

First of all, what are we talking about? Since time immemorial, medicine has been an art, and physicians have been craftsmen. In other words, after initial training with mentors, physicians refine their knowledge and reasoning on the basis of their professional experience. This approach has drawbacks that need no elaboration. This state of affairs began to change in the 19th century with Claude Bernard’s “experimental medicine”, then in the 20th with the arrival of molecular pharmacology and the acceleration of knowledge in human biology. The transition of medicine from an art to a science was completed half a century ago with the generalisation of controlled clinical trials, in which one strives to eliminate personal arbitrariness and to assess the validity of observations using statistics, often sophisticated ones. The recent advent of high-throughput molecular data has added an aspect of precision and personalisation to this “evidence-based medicine”.

Which brings us to the use of a false friend. “Evidence” is the word used in Anglo-Saxon courts that is equivalent to the French word “preuve”. In science, however, “preuve” translates as “proof”. The latter meaning is much stronger than the former (one could, for that matter, discuss at length the difference in status between “preuves” in French-speaking courts and “evidence” in Anglo-Saxon courts). Andrew Wiles provided the proof of Fermat’s conjecture (which should therefore be called Wiles’s Theorem…). Fermat’s theorem is always true. For strictly positive integers x, y, z, there exists no integer n > 2 such that x^n + y^n = z^n. This result is true and will always remain so.

In contrast, the results of a biological experiment or a clinical trial give us a much more nuanced picture. First, results come with a confidence level. If the p-value is 0.05 (a value often used in medical statistics), this means, roughly, that a result at least as extreme would arise by chance alone 5% of the time if there were no real effect (it is a bit more complicated than that, but that is not the topic of this post). Moreover, different results could be obtained with another cohort presenting other characteristics, either obvious (age, sex, health status) or more subtle (a different proportion of key haplotypes between control and treated groups). Hence the existence of meta-analyses, which make it possible to reconcile several clinical trials.

The result of a clinical trial is highly respectable and must serve as a reference in the absence of contrary information (and of particular situations such as the patient’s circumstances, the availability and price of treatments, etc.). But it is not a “preuve” (proof). I therefore object to the translation of “evidence-based medicine” as “médecine fondée sur les preuves”, even though it is the most widely used. In my opinion, it is a bad anglicism.

Since we are on the subject of anglicisms, let us dispose right away of “basé sur”. The Académie française tells us:

“It is agreed today that Baser sur should be used in the military domain and reserved for it: Des troupes ont été basées sur la frontière. One should therefore avoid the figurative use, a transposition of the English based on, which has spread improperly, and prefer synonyms such as Fonder, Établir or Asseoir.”

How, then, should we translate “evidence”? One could, like Wikipedia, draw on the factual aspect of the result and use “médecine fondée sur les faits” (medicine founded on facts). But here again, the result and the conclusion are being confused. The result of the clinical trial is a fact: a treatment administered according to a certain regimen to a given cohort led, with a probability greater than 0.95, to an average improvement of X%, with 95% of the measurements lying within a given interval around X. The conclusion, namely that the treatment leads to an improvement of X%, is not.

To “fait” (fact), I would therefore prefer “données” (data), which is… in fact (sic) the most used noun after “preuve”. In the end, the practitioner will use these data, together with data from other trials and from longitudinal monitoring (cohorts or personal experience), the patient’s circumstances, and geographical, temporal, and financial circumstances, to decide on the course of action.

And there is no need to add an adjective to bring proof back in through the back door, as is often seen with “médecine fondée sur les données probantes” (medicine founded on conclusive data). And if by “données probantes” one merely means data that can be relied upon, we fall into tautology. If a piece of data is not reliable, why take it into account?

Evidence-based medicine = Médecine fondée sur les données

Get your documents checked before getting them translated

By Nicolas Gambardella

In some domains, the most challenging part of language translation (in a broad sense) is the translation itself (in a narrow sense), i.e., converting the words from one language to the other while accurately conveying the meaning and the tone of the original document. In the scientific and technical domains, this is not always the case. It is not uncommon that most of the time I spend on a text is, in fact, devoted to understanding the source in English.

The main reason is that many of those documents are not written by domain specialists proficient in the good William’s language (Shakespeare, not The Conqueror). Most people working in highly technical domains, such as biomedical and pharmaceutical, have been reading, writing, and speaking English for many years. They have produced research publications, technical reports, grant applications, and lectures for international audiences. Communicating with others in English has never been a problem. When the time comes to write a brochure, a presentation for HCPs or patients, or a website, they naturally assume their usual English level is sufficient. Rather than spending time speaking with a professional writer, who would struggle to understand the technical subtleties and would cost money, isn’t it quicker to do the job yourself? After all, who knows better than yourself what you want to say?

I think this is true. The initial raw material should come from the horse’s mouth. However, we should all be acutely aware that being able to converse with our colleagues is rarely sufficient when producing patient- or consumer-ready documents. During a conversation, half the meaning is conveyed through body language, visual support, and implicit shared knowledge. Your English colleague knows what you truly mean when you use those dozens of anglicised French words (replace with your own native language). They might even find that charming. It will be much less charming for a potential client or your poor translator. The former might be put off by what could be perceived as a lack of professionalism. The latter might mistranslate your document, with potentially dire consequences.

I recently finished reviewing a medical marketing brochure for a foreign company. Many sentences were grammatically incorrect, to the point of becoming meaningless. Most sentences were convoluted, too long, and repetitive. The paragraphs were heavy and hard to read. A significant number of words were slightly off, definitely not what a native English marketing brochure would have used (or any native text for that matter). Finally, the formatting was completely inconsistent (e.g., usage of capitals or abbreviations).

I do not think this brochure reflected well on an otherwise excellent company. I believe that if the foreign person who wrote the English text had taken the time to reverse-translate it with a tool like DeepL, they would have been horrified to see that the result was not at all what they intended. I hope the French translation will alleviate some of the issues. I also provided a complimentary list of suggestions for the English version, as I often do when translating such documents.

We should always ask someone else to edit and proofread our texts. If possible, this should be someone with no in-depth knowledge of the subject. Ideally, the pipeline would comprise several verification layers, possibly combined in fewer individuals or, on the contrary, repeated with several people:

  • Verification of the technical content. Are you even using the correct English words?
  • Marketing and localisation. The US is not the UK. Patients are not HCPs. HCPs are not researchers.
  • Proofreading. Itself with three subcategories: language correctness (grammar, punctuation, and spelling); elegance and fluidity; terminology and visual consistency throughout the document(s).

Now, to finish on a lighter note, some language-specific advice for scientists:

Starting, of course, with the French. My dear fellow countrymen, the fact that English and French share the same sentence structure does not mean you can replace the words one by one and keep the French term if you do not know the English equivalent.

It would be best if Italians gave a subject to all verbs, which feel lonely otherwise.

As bothersome as it seems, Russians must use articles in front of nouns, at least from time to time.

Chinese writers should realise that spelling in Latin script is as essential as the correct stroke in Hànzì. Vowels are not interchangeable.

While the pursuit of accuracy is laudable, Germans writing in English should seek to limit the number of words in their sentences to double digits.

To finish, I would love to hear all your comments, corrections, and criticisms regarding this post.

Getting the best value for a model parameter

By Nicolas Gambardella

A crucial part of any computational modelling is getting parameter values right. By computational model, I mean a mathematical description of a set of processes that we can then numerically simulate to reproduce or predict a system’s behaviour. There are many other kinds of computational or mathematical models used in computational biology, such as 3D models of macromolecules, statistical models, machine learning models and more. While some concepts dealt with in this blog post would also be relevant to them, I want to limit the scope of this post to what is sometimes called “systems biology” models.

So, we have a model that describes chemical reactions (for instance). The model behaviour will dictate the values of some variables, e.g. substrate or product concentrations. We call those variables “dependent” (“independent variables” are variables whose values are decided before any numerical simulation or whose values do not depend on the mathematical model, such as “time” for standard chemical kinetics models). Another essential set of values that we have to fix before a simulation consists of the initial conditions, such as initial concentrations.
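To make these terms concrete, here is a minimal sketch in plain Python, not taken from COPASI or from the model used below. It assumes a hypothetical reversible reaction A <-> B with made-up rate constants `kf` and `kr`, and integrates it with a simple fixed-step Euler scheme. Time is the independent variable, the concentrations of A and B are the dependent variables, and `a0` and `b0` are the initial conditions.

```python
def simulate(kf, kr, a0, b0, t_end, dt=0.001):
    """Integrate A <-> B with mass-action kinetics using explicit Euler steps.

    kf, kr : forward and reverse rate constants (fixed before the simulation)
    a0, b0 : initial concentrations (initial conditions)
    Returns the concentrations of A and B at time t_end.
    """
    a, b = a0, b0
    for _ in range(int(round(t_end / dt))):
        flux = kf * a - kr * b  # net conversion rate of A into B
        a -= flux * dt
        b += flux * dt
    return a, b


# Hypothetical values chosen only for illustration.
a_end, b_end = simulate(kf=2.0, kr=1.0, a0=1.0, b0=0.0, t_end=10.0)
print(f"A = {a_end:.4f}, B = {b_end:.4f}")
```

With these values the system has relaxed to its steady state by the end of the run, where B/A equals kf/kr. Real tools use adaptive, far more robust integrators than this fixed-step scheme.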

The quickest way to get cracking is to gather the parameter values from previous models (found for instance in BioModels), from databases of biochemical parameters such as Brenda or SABIO-RK, or from patiently sifting through the scientific literature. But what if we want to improve these values? This blog post will explore a few possible ways forward using the modelling software tool COPASI, namely, sensitivity analysis, picking parameter values and looking at the results, parameter scans, optimisation, and parameter estimation.

Loading a model in COPASI

First, we need to have a model to play with. The post will use the model of MAPK signalling published by Huang and Ferrell in 1996. You can find it in BioModels, where you can download the SBML version and import it into COPASI. Throughout this post, I will assume the reader masters the basic usage of COPASI (creating reactions, running simple simulations, etc.). You will find some introductory videos about this user-friendly, versatile, and powerful tool on their website.

The model presents the three-level cascade activating MAP kinase. MAPK, MAPKK, and MAPKKK mean Mitogen-activated protein kinase, Mitogen-activated protein kinase kinase, and Mitogen-activated protein kinase kinase kinase, respectively. Each curved arrow below represents three elementary reactions: binding and unbinding of the protein to an enzyme, and catalysis (addition or removal of a phosphate).

The top input E1 (above) is called MAPKKK activator in the model. To visualise the results, we will follow the normalised values for the active (phosphorylated) forms of the enzymes, K_PP_norm, KK_PP_norm and KKK_P_norm, which are just the sums of all the molecular species containing the active forms divided by the sums of all the molecular species containing the enzymes (NB: Throughout the screenshots, the red dots were added by me and are not part of COPASI’s GUI).

Let’s run a numerical simulation of the model. Below you see the activation of the three enzymes, with the swift and supra-linear activation of the bottom one, MAPK, one of the hallmarks of the cascade (the others being an amplification of the signal and an ultrasensitive dose-response, which allows MAPK to be fully activated with only a touch of MAPKKK activation).

Sensitivity analysis

The first question we can ask ourselves is “Which parameters’ values most affect the output of our system?”. To answer it, we can run a sensitivity analysis. COPASI will slightly vary all the parameters and measure the consequences of these changes on the results of a selected task, here the steady state of the non-constant concentrations of species.

We see that the most important effect is the impact of MAPKKK activator binding constant (k1) on the concentration of PP-MAPK, which happens to be the final output of the signalling cascade. This is quite relevant since the MAPKKK activator binding constant basically transmits the initial signal at the top of the cascade. You can click the small spreadsheet icon on the right to access coloured matrices of numerical results.
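The principle of such an analysis can be sketched in plain Python. This is not COPASI’s implementation and not the MAPK model: it assumes a hypothetical one-step activation cycle whose steady state is known analytically, nudges each parameter by a small relative step, and reports the scaled sensitivity, i.e. the relative change in output per relative change in parameter. All names and values are illustrative.

```python
def steady_state_active_fraction(k_on, k_off, signal):
    """Steady-state active fraction of a toy cycle:
    inactive -> active at rate k_on * signal, active -> inactive at rate k_off."""
    return k_on * signal / (k_on * signal + k_off)


def scaled_sensitivity(model, params, name, rel_step=1e-4):
    """Approximate d ln(output) / d ln(parameter) with a forward finite difference."""
    base = model(**params)
    nudged = dict(params)
    nudged[name] = params[name] * (1 + rel_step)
    return (model(**nudged) - base) / (base * rel_step)


# Hypothetical parameter values chosen only for illustration.
params = {"k_on": 1.0, "k_off": 3.0, "signal": 1.0}
sensitivities = {
    name: scaled_sensitivity(steady_state_active_fraction, params, name)
    for name in params
}

# Rank parameters by the magnitude of their effect on the output.
for name, value in sorted(sensitivities.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {value:+.3f}")
```

A positive value means the output rises when the parameter increases, a negative one that it falls; values close to zero flag parameters that barely matter for that output.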

Testing values

All right, now we know which parameter we want to fiddle with. The first thing we can do is visually look at the effect of modifying the value. We can do that interactively with a “slider”. Once in the timecourse panel, click on the slider icon in the toolbar. You can then create as many sliders as you want to set values manually. Here, I created a slider that can vary logarithmically (advised for most model parameters) between 1 and 1 million. The initial value, used to create the timecourse above, was 1000. We see that sliding to 100 changes the model’s behaviour quite dramatically, with very low enzyme activations. Moving it well above 1000 shows that we increase the speed of activation of the three enzymes and increase the activation of the top enzyme, albeit without significant gain on K_PP, our output of interest.

Parameter scans

Playing with sliders is great fun. But this is not very precise. And if we want to evaluate the effect of changing several parameters simultaneously, this can be extremely tedious. However, we can do that automatically thanks to COPASI’s parameter scans. We can repeat the simulation with the same value (useful to repeat stochastic simulations), systematically scan parameter values within a range, or sample them from statistical distributions (and nest all of these). Below, I run a scan over the range defined above and observe the same effect. To visualise the scan’s results, I created a graph that plots the active enzyme’s steady state versus the activator binding constant.
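For readers who prefer code to dialog boxes, the idea of a logarithmic scan can be sketched as follows. This is a plain Python illustration, not COPASI’s scan task: `toy_steady_state` is a hypothetical saturating response standing in for a full simulation, and its half-maximum value of 1000 is an arbitrary assumption.

```python
def log_spaced(low, high, count):
    """Return `count` values from low to high, evenly spaced on a log scale."""
    ratio = high / low
    return [low * ratio ** (i / (count - 1)) for i in range(count)]


def toy_steady_state(binding_constant, half_max=1000.0):
    """Hypothetical saturating response standing in for a full simulation."""
    return binding_constant / (binding_constant + half_max)


# Scan the same range as the slider: 1 to 1 million, one point per decade.
scan = [(k, toy_steady_state(k)) for k in log_spaced(1.0, 1e6, 7)]
for k, response in scan:
    print(f"k = {k:>10.1f}  response = {response:.4f}")
```

Spacing the points evenly on a log scale gives every decade the same number of samples, which is why it is advisable for parameters that may span several orders of magnitude.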

Optimisation

All that is good, but I still have to look at curves or numerical results to find out the best value for my parameter. Ideally, I would like COPASI to hand me the value directly. This is the role of optimisation. Optimisation is made up of two parts: the result I want to optimise and the parameter to change to optimise it. I will not discuss here when a value can be optimised; there are many cases for which optimisation is just not possible. For instance, it is not possible to optimise the production of phosphorylated MAPK: whatever upper bound we fix for the activator binding constant, the optimal value would end up on this boundary. In this example, I decided to maximise the steady-state concentration of K_PP for a given concentration of KKK_P, i.e. getting the most bang for my buck. As before, the parameter I want to explore is the MAPKKK activator binding constant. I fix the same lower and upper bounds as before. COPASI offers many algorithms to explore parameter values. Here, I chose Evolutionary Programming, which offers a good balance between accuracy and speed.

The optimal result is 231. Indeed, if we look at the parameter scan plot, we can see that with a binding constant of 231, we get an almost maximal activation of MAPK with minimal activation of MAPKKK. Why is this important? All those enzymes are highly connected and will act on downstream targets right, left, and centre. In order to minimise side effects, I want to activate (or inhibit) proteins as little as necessary. Being efficient at low doses also helps with suboptimal bioavailability. And of course, using 100 times less of the stuff to get the same effect is certainly cheaper, particularly for biologics such as antibodies.
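The difference between an objective that ends up on the boundary and one that has a genuine optimum can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `downstream` and `upstream` are hypothetical saturating functions, not the MAPK model, and I swapped in a plain golden-section search on a log scale instead of COPASI’s Evolutionary Programming, because it is short and deterministic.

```python
import math

GOLDEN = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def maximise_log(objective, lower, upper, tolerance=1e-6):
    """Golden-section search for the maximum of a unimodal objective, on a log10 scale."""
    a, b = math.log10(lower), math.log10(upper)
    c, d = b - GOLDEN * (b - a), a + GOLDEN * (b - a)
    fc, fd = objective(10 ** c), objective(10 ** d)
    while b - a > tolerance:
        if fc > fd:
            # The maximum lies in [a, d]: shrink from the right.
            b, d, fd = d, c, fc
            c = b - GOLDEN * (b - a)
            fc = objective(10 ** c)
        else:
            # The maximum lies in [c, b]: shrink from the left.
            a, c, fc = c, d, fd
            d = a + GOLDEN * (b - a)
            fd = objective(10 ** d)
    return 10 ** ((a + b) / 2)


def downstream(k, half_max=100.0):
    """Hypothetical activation of the output enzyme (assumption)."""
    return k ** 2 / (half_max ** 2 + k ** 2)


def upstream(k, half_max=1000.0):
    """Hypothetical activation of the top enzyme (assumption)."""
    return k / (half_max + k)


def efficiency(k):
    """Output obtained per unit of upstream activation."""
    return downstream(k) / upstream(k)


# Maximising the raw output just runs into the upper bound...
best_output = maximise_log(downstream, 1.0, 1e6)
# ...whereas the output relative to upstream activation has an interior optimum.
best_efficiency = maximise_log(efficiency, 1.0, 1e6)
print(f"raw output: {best_output:.0f}, efficiency: {best_efficiency:.1f}")
```

Golden-section search only works for objectives with a single peak over the range. Population-based methods such as the one used in the post do not need that assumption, at the cost of more evaluations.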

Parameter estimation

We are now reaching the holy grail of parameter search, which is parameter estimation from observed results. As with optimisation, this is not always possible. This is known as the identifiability problem. Using the initial model, I created a fake noisy set of measurements, which would, for instance, represent the results of Western blots or ELISAs using antibodies against the phosphorylated and total forms of RAF, MEK, and ERK, which are a specific MAPKKK, MAPKK, and MAPK, respectively.

I can load this file (immuno.txt on the screenshot) in COPASI, and map the experimental values (automatically recognised by COPASI) to variables of the model. Note that we can load several files, and each file can contain several experiments.

I can then list the parameters I want to explore, here the usual activator binding constant, between 1 and 1 million. Note that we can use only some of the experiments in the estimation of given parameters. This allows building extremely powerful and flexible parameter estimations.

We can explore the parameter space with the same algorithms as for optimisation. The plot represents the best solution. Dashed lines link experimental values, continuous lines represent the fitted values, and circles are the error values.

The value estimated by COPASI for the binding constant is 1009. The “experiment” was created with a value of 1000. Not bad.
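The principle of the exercise above (simulate with a known value, add noise, then recover the value by fitting) can be sketched in a few lines of Python. This is not what COPASI does internally: `model` is a hypothetical one-parameter time course, the noise level and time points are arbitrary assumptions, and the fit is a brute-force least-squares search over a logarithmic grid between 1 and 1 million.

```python
import math
import random


def model(t, k):
    """Hypothetical time course: activation approaches 1 at a rate set by k (assumption)."""
    return 1.0 - math.exp(-k * t / 1000.0)


def make_fake_data(true_k, times, noise_sd, seed=42):
    """Simulate the model with a known parameter and add Gaussian noise."""
    rng = random.Random(seed)
    return [(t, model(t, true_k) + rng.gauss(0.0, noise_sd)) for t in times]


def sum_of_squares(k, data):
    """Sum of squared differences between the measurements and the model."""
    return sum((y - model(t, k)) ** 2 for t, y in data)


def estimate(data, lower=1.0, upper=1e6, points=601):
    """Return the value on a log-spaced grid that minimises the sum of squares."""
    start = math.log10(lower)
    span = math.log10(upper) - start
    candidates = [10 ** (start + span * i / (points - 1)) for i in range(points)]
    return min(candidates, key=lambda k: sum_of_squares(k, data))


times = [0.25 * i for i in range(1, 21)]
data = make_fake_data(1000.0, times, noise_sd=0.02)
estimated_k = estimate(data)
print(f"estimated k: {estimated_k:.0f} (data generated with 1000)")
```

The estimate lands close to the true value, but not exactly on it, both because of the noise and because of the finite grid. If the data barely depended on `k`, many values would fit equally well, which is the identifiability problem in miniature.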

This concludes our overview of parameter exploration with COPASI. Note that this only scratches the surface of the topic, and I hope I have piqued your curiosity enough for you to read more about it.

Do you have a model that you want to improve? Do you need to model a biological system but do not know the best method or software tool to use? Drop me a message and I will be happy to have a chat.

On the position of adjectives

By Nicolas Gambardella

In French, an adjective should always be placed as close as possible to the noun it qualifies. That seems obvious, doesn’t it? And yet it is not a reflex, especially in technical texts, which tend to borrow English syntax. The position of the adjective is not neutral, however. On the contrary, it is essential for the text to be understood properly and to be reused later.

Take the example of a pregnancy test based on measuring the hormone hCG (human chorionic gonadotropin) in urine. Should we write “test urinaire de grossesse” or “test de grossesse urinaire”? The former, of course, because it is the test that is urinary, not the pregnancy. In the plural, there is no problem. In “tests de grossesse urinaires”, the S’s reveal the link between the test and the substance tested. The situation is more complex in the singular. A human being will of course have no trouble understanding that a pregnancy cannot be urinary (at least one hopes so). On the other hand, automatic lexical analysis systems (which are a bit simple-minded, it must be said) will split the expression into

[ TEST ] [ DE ] [ [ GROSSESSE ] [ URINAIRE ] ]

By contrast, a “test de grossesse extra-utérine” tests for the extra-uterine (ectopic) location of the pregnancy. Here, it is the pregnancy that is extra-uterine, not the test.
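To illustrate the kind of simple-minded analysis mentioned above, here is a minimal Python sketch of a chunker that attaches an adjective to the word immediately before it. It is a toy built for this example, not an existing parser: the function name and the two-word adjective lexicon are assumptions.

```python
# Toy lexicon (assumption): the only adjectives this sketch knows about.
TOY_ADJECTIVES = {"urinaire", "extra-utérine"}


def naive_chunks(phrase):
    """Group each known adjective with the single word that precedes it."""
    chunks = []
    for word in phrase.lower().split():
        if word in TOY_ADJECTIVES and chunks and isinstance(chunks[-1], str):
            # Purely positional rule: no check that the pairing makes sense.
            chunks[-1] = [chunks[-1], word]
        else:
            chunks.append(word)
    return chunks


print(naive_chunks("test de grossesse urinaire"))
print(naive_chunks("test urinaire de grossesse"))
print(naive_chunks("test de grossesse extra-utérine"))
```

With such a rule, only the word order decides what the adjective qualifies: “urinaire” ends up grouped with “grossesse” in the first phrase and with “test” in the second.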

All this seems perfectly logical. What about in practice? Unfortunately, our friend Google returns 7,970 results for the unfortunate “test de grossesse urinaire” versus only 3,310 results for “test urinaire de grossesse” (although Google Trends has the first form winning over the long term). Academia does slightly better, with Google Scholar returning 77 entries for “test urinaire de grossesse” versus 71 for “test de grossesse urinaire”. But there is worse, since the latter form appears in reference texts such as the Vidal, and even in documents from the Haute Autorité de Santé.

Now let us look at a more complicated example, the “rapport normalisé international”, the translation of the English “international normalized ratio”. Is it really “rapport normalisé international”, for which Google gives 6,610 results and Google Scholar 99 entries, or “rapport international normalisé”, for which Google gives 15,300 results and Google Scholar 234 entries? It is the former. This variable measures the ratio between a patient’s clotting time (the “temps de Quick”, or prothrombin time) and its value in a control. This is the sense in which it is “normalised”. Since this measure is recognised internationally, it is the “rapport normalisé international”.

Why is the situation more complicated than with the urine pregnancy test? The key word is “normalisé”. While the English version is unambiguous, the French one is not: the adjective “normalisé” translates both “normalized”, i.e. a value brought to a common scale, and “standardized”, i.e. a value officially recognised as a standard. Hence, “rapport international normalisé” could have an entirely different meaning, namely the “standard” version of the “international ratio”. I would not be surprised, incidentally, if a large share of the unfortunate placements originated in this confusion.

In conclusion, always place the adjective as close as possible to the noun or expression it qualifies. And if you are translating a text from English to French, do not hesitate to change the order of a string of adjectives when moving from one language to the other. Finally, if you have the slightest doubt, look up the meaning of the expression.

To translate, one must understand.

Tolerance, tolerability, innocuousness, safety, relationships, and differences

By Nicolas Gambardella

[French version]

When it comes to clinical trials and pharmacovigilance, using the right word is crucial for an accurate and precise shared understanding. Unfortunately, this does not always happen, even in documents written by specialists. While this can lead to a certain degree of vagueness in communication within a given language, the situation becomes worse when documents need to be translated, especially as the use of given terms may be subject to hyponym/hyperonym relationships that differ according to language and context.

In this post, we will look at some terms that generate endless debate in professional forums, all related to the safety and efficacy data associated with treatments, namely: tolerability, tolerance, safety, innocuousness, and their French equivalents, tolérabilité, tolérance, sécurité, and innocuité.

Tolerance and tolerability

Tolerance to treatment is the word used in pharmacology and clinical trials to designate habituation, i.e. the fact that the same intensity of treatment leads to increasingly weaker effects (note that in the field of drugs of abuse, the terms tolerance and addiction sometimes carry a subtle difference: tolerance is the fact that a given dose leads to an increasingly weaker effect, whereas addiction means that higher and higher doses are needed to obtain a given effect). The French translation of tolerance is tolérance.

Tolerability characterises the subject’s ability to withstand adverse effects. While tolerance refers to efficacy data, tolerability refers to safety data. It is a precise term from clinical trials and is not found in any standard dictionary. The French translation of tolerability is tolérabilité. Let’s be honest, these terms are atrocious. However, when it comes to patient safety, the literary elegance and aesthetics of a word are less important than precision and accuracy. “Tolérabilité” is often criticised as an anglicism, with critics encouraging the use of “tolérance” to translate tolerability (as many French dictionaries do). This overlooks the fact that, in addition to being incorrect, “tolérance” in this sense is itself originally an anglicism. Perhaps one way to clarify things is to look at adjectives. A person is tolerant to a drug, while a drug is tolerable for a person.

Safety and innocuousness

We are now entering a turbulent zone where tempers flare and linguists grapple with each other, most of the time because they are not versed in the handling of ontologies and hyponym/hyperonym relationships. Do safety and innocuousness refer to different concepts? Yes. Do the concepts of safety and innocuousness overlap? Yes. Do safety and innocuousness always translate into the same French terms? No.

Let’s start with innocuousness. The innocuousness of a drug is its ability to work without causing adverse effects (harmlessness). A drug that is innocuous is a safe drug. The term is rarely used in practice (see below). The French translation of innocuousness is innocuité. It should be noted that innocuousness and tolerability are not synonyms. A drug may have poor innocuousness but good tolerability. For example, it could cause side effects in most cases, with these effects being well tolerated.

Where things get complicated is when we approach the notion of safety. The French translation of safety is sécurité. Depending on the context, the concept of safety covers a more or less broad semantic landscape. We will translate the expression “safety and tolerability” as “innocuité et tolérabilité”. However, in the expression “safety and efficacy”, safety covers both innocuousness and tolerability. Therefore, we will translate the expression as “sécurité et efficacité”. Note that “safety” is never translated as “sûreté”, although “safe” is translated as “sûr”.

Finally, let us agree that while the greatest precision is necessary between health professionals, it must not lead to pedantry that hinders clear and elegant communication with the patient. We will therefore translate our “commitment to ensuring your safety” as our “engagement à garantir votre bien-être” and not “garantir votre sécurité”. Unless the practitioner is also a bodyguard.