TAUS, “data-driven translation” and the future

A few weeks ago my attention was drawn to an article at http://www.translationautomation.com/perspectives/translation-leaks.html. In the associated discussion, I said that “it would be easy to pick holes in his largely spurious arguments if I had time”. Someone commented that “suggesting that you have the capability but not the time to pick apart somebody else’s cogent arguments is a bit like those people who sometimes offer to black one’s eye over the Internet”, so now that I have a little time, here are my thoughts. I reproduce the article in full, so that my comments, which are in parentheses, have the full context.

Tuesday, 07 December 2010 15:00 Jaap van der Meer

You trusted your bank. You trusted your currency. You trusted your government. You trusted your translations.

So what happens now? Your certainties are being unraveled one after another. The system you trusted is leaking. It is unsettling. And even scary… But then you realize: trust is good, and knowing the facts is always the best policy.

You trusted your translations, your carefully chosen terminology, and your translation memories built up with great care, and well protected against unauthorized use. And now, you realize that your translations are as good or as bad as any others, that your customers rarely read your translations and often rely on machine-translated texts – machine translations that, annoyingly, are sometimes surprisingly good. (Certainly some, and probably most, of my clients do read my texts, in some cases quite carefully. If they don’t read yours, you must be in a different branch of the industry.) You realize that others are sharing translation memories, perhaps even your translation memories that you have nurtured as your own assets. You realize now that your secure world of translation is leaking. You are losing control. The model isn’t working anymore. It is upsetting, but it is better to face this new reality. Let go of the illusion of control. (I never did have an illusion of control, thank you. I maintain my translation memories as carefully as I can, but I have sold my translations – it is open, as it always was, to my clients to do with them whatever they want.)

New reality

The world is changing and the translation industry is lagging behind. The industry still operates with a 20th century, western world mindset: the developed world exporting its goods and spreading its civilization to customers rich enough to pay. (That’s what I call business.) Translation is largely one-directional – from English into the major languages – and one translation per language fits all customers. (I don’t have the figures to challenge this, but it strikes me as a surprisingly narrow and extraordinarily Anglo-centric view to be held by somebody who works in the translation field. English may, indeed, be at present one of the world’s most important technical and commercial languages, but surely, by that very token, a huge slice of the translation business involves translating texts into English, and I am quite sure that another huge slice is between languages where neither is English.) Translation is priced by the word (in many cases) and managed in projects (that’s the way business is done). A project is traditionally product documentation, or instructions for use, or the user interface. Each project in principle is meant to increase sales in new markets and is measured by ROI. Efficiencies come from an overly simplistic (I take it that “overly simplistic” is an overblown way – should I say “overlyblown” – of saying “very simple”) technology called translation memory that was invented in the 80’s of the last century and has hardly advanced since then. (I am at a total loss trying to understand why the author regards software of this sort as in some way excessively simple. If he believes that it has not advanced over the last 20 years, I suspect he has not been using it. It has advanced hugely, as any user must surely know.)

Translation in the 21st century requires a very different vision. (I believe we are getting to his main commercial pitch.) Western hegemony is over. Products and services are developed, manufactured and marketed everywhere and anywhere. Customers are more self-confident and don’t read manuals anymore. They read blogs and peer reviews and pull information from customer support sites when they need it. In fact new generation users don’t need user instructions at all and if new products are well designed, they’re completely intuitive and let the users ‘plug and play’. But new generation customers are also more discerning when they buy a product.

In this new regime, translation is multidirectional, from any language into any language. (I think it always was – see above.) Quality requirements are different for different users and different usages. Machine translation is good enough for the largest volumes of dynamic web content (I wonder which language he has tried this out on. Recently I have had occasion to make considerable use of Google Translate for Italian, as I now live in Italy without speaking the language. Italian may be less important than English, but it is scarcely an obscure language. The plain fact is that in most cases the reader is lucky if the sense rises above gibberish. Worse still than being incomprehensible, it is not uncommon for it to be entirely wrong, failing for instance to see negatives or inserting negatives where they do not exist.), whereas pre-sales texts require a step-up in quality from the current one-translation-fits-all policy. (Who uses this policy? I suspect most of my clients have always had a higher standard.) Tuning in to the style and sub-culture of niche customer groups makes all the difference in an increasingly global marketplace. Word-based pricing and ROI measurement do not make much sense in this new economic reality. Translation memory software still serves its goals in the shrinking business of manual translations (Is there any evidence that the market is actually shrinking? I somehow doubt it), but it is totally inadequate for the growing volumes of dynamic content.

A Lesson for Translation in the 21st Century

If you can see even half way through this new reality, your concerns over translation leaks will begin to give way to a growing sense of excitement. Translation is coming out of the dusty libraries. (I think that happened some while ago!) Translation is gaining in relevance and significance as a worldwide service industry with billions of customers. Translation as a feature on every web site and every mobile device is the key to a vital global economy.

It is still unsettling of course when you realize that 90% of the translated words will be generated by machine translation engines, probably at no charge to the end-user. But considering there is a non-stop stream of multimedia information, the translation market will certainly innovate and assert its value in different ways. Naturally there is no need for every business to change and everyone to automate. In fact there will be a growing need for high-quality, tailored translations. But if you are tempted to join the innovation wave, I am sure you will figure out a way to prosper in this rapidly changing environment.

One aspect of the future of translation, however, is easily overlooked: the importance of ‘data’. ‘Data’ replaces the role of ‘translation memories’ as the key to efficiency. A jet engine with a thousand times the power of those 1980s propellers. Data drive translation engines. Data will control the quality and the efficiency of translation in the future. (This is a highly contentious point, and there is no substantial evidence yet that data-driven translation engines can do an even adequate job, let alone a good one, of anything but the most restricted, narrow types of text with highly concrete references. Perhaps the directions for how to get from Wimbledon to Andover might be translated this way, although I would want to be sure that the source didn’t contain anything “challenging” like “don’t turn left here, even though the sign tells you to”. I concede that data may indeed control the quality of this kind of translation – it is likely to keep it at a low level.) Whoever has access to the data controls the future of translation. Privileged or monopolized access to data will jeopardize the blossoming of a 21st century translation industry. (Jeopardize the author’s project?) Ownership of translation memories – translation data – is therefore an important and sensitive topic of debate. The legal argument will not help us much longer in this age of translation leaks. (I take it that he is trying to preemptively prepare the ground in the hope of deflecting the criticism that much of what TAUS plans may well be in breach of copyright.) Data are mined, scraped, masked, shared and used by everyone from individual translators to large global corporations. Attempting to make a legal case against the unauthorized use of translation data will probably not work. The practical argument is all that counts (he hopes), and once translations are published, there is no way to control the leaks (he hopes). And to be honest, wouldn’t you rather turn the whole argument round?
If your translations are not confidential, why not simply share these data with everyone who can use them to improve the efficiency and quality of translation as a whole. What stops you from doing this? (What stops me? Professional discretion, for a start, even if I haven’t signed a nondisclosure agreement (NDA). I don’t know quite how typical my position is, but I do know that it is not unusual. As a freelance translator, I have sold my translations. I feel that I have the right to recall my past translations, and to use technology to assist me in that task, and in that way to improve my future translations. But it seems to me quite clear that, at least in the vast majority of cases, copyright was assigned to the client. Suppose that a German nut manufacturer wrote some brilliant advertising copy, and imagine that, after my attentions, the English version was still brilliant. Let’s imagine that the two were published side by side on the client’s website. Now let’s suppose that a German bolt manufacturer sees this and thinks “Wunderbar, ve kann kopy zis text, und verr ze nut makers haff written ‘nut’, ve kann write ‘bolt’, und ve haff ze über-brilliant advertising kopy for ten cents only und ve kann sell our bolts to ze British and ze Amerikans!”. Clearly there would be a problem, and it seems to me obvious that it would be the nut makers who would have a case against the bolt makers for breach of copyright. My translation was sold, and it was up to them to do whatever they liked with it. Now over the years I have worked for over 50 clients, and since many of those have been agencies, there must have been several hundred end-clients. If I were to share my translation memories with TAUS I would therefore need to get written permission from hundreds of owners, many of whom, for obvious commercial reasons, are unknown to me. And that is true even for those where I have not signed any sort of NDA.
I wonder to what extent TAUS is simply hoping that nobody will ever notice how much copyright material is present in the database they are assembling.

Oh, yes, there is another reason that stops me from doing this. Believe it or not, rather than paying me a substantial sum for joining them and sharing my translation memories, they want me to pay them. Work that one out!)

Wishing you Wisdom (I think he means “Wishing you will come over to my point of view”)

Here we are, at the start of the second decade of a new millennium, facing some real dilemmas. I know it’s hard to take such a radical step from one world into another. But be aware that while you are puzzling over which direction to go, your translations are leaking.

We wish you wisdom and success in 2011 and the decade ahead. We at TAUS are here to help as the industry think tank (or so they want us to believe), your innovation partner and a safe harbor for sharing your translation memories.


8am PDT / 5pm CET
Wednesday 15 December
45-minutes in duration

We share insights on the future of the translation industry. Content is based on a number of market and collective intelligence exercises undertaken by TAUS during 2010. This includes continuous review of the market, ideation sessions with major translation decision makers, and discussion with leading scientists, amongst others.

(I included this advert from the bottom of the webpage, as an example of the language used by these language professionals. “Ideation sessions.” Good grief.)
