Author: Alex

Perpetual Motion Patent Application

Just so that we can remind ourselves that patents are not always what they are cracked up to be, I could not resist throwing light on US 20120161564 A1, which declares itself to be a “Device and Method for Recycling Energy”. Let me quote part of the opening paragraph directly:

“The device includes a battery, motor, and generator. The battery powers an electric motor, which in turn powers the generator, which then simultaneously re-charges the battery and powers external equipment. With this feedback loop, the system sustains itself, while at the same time powering an external device, and requires no fuel or connection to an external power source to recharge the batteries.”

In other words, it’s a perpetual motion machine. A young would-be technologist should be well aware by the age of about 14 that this cannot fly. We can rest assured that the application will never be granted – I think the term we would see is something like “conflicts with established physical principles”.

But I can’t help but wonder how it came into being. I looked for clues, like a priority date of April 1, and I put the applicants’ names into anagram finders to see if they came up with something like “ha ha fooled you”, but failed to find anything. Is it a student prank?

New look website

Refreshing this website, simple as it is, was overdue. Most of my customers, actual or potential, I’m sure, would be looking at the website on some kind of desktop screen, so perhaps being mobile-friendly is not that important, but these days it seems really old-fashioned not to be mobile-friendly. But these changes are always a fiddle, so I had been putting it off.

But with a change of hosting service, moving from Australia to Europe, something went wrong in an admittedly very small corner of the presentation. I decided to bite the bullet: the website is not complicated, by any stretch of the imagination, so it’s now got a new theme, a new look, and been slightly reorganised.

The content is almost unchanged, although I have clarified the best ways of getting in touch with me. The biggest change is to the note on my availability. Years ago I had tried to use a Twitter feed to update it, but for one reason or another that only worked sometimes. At that time it was a “widget” in the right-hand sidebar, and when I abandoned the Twitter-based method I left it where it was. But with the more modern presentation, the “right-hand” sidebar can equally well appear underneath. I’m trying to follow the motto of “don’t make your readers think,” so it now has its own page, and that page has an entry on the menu which is easy to find. You can also directly bookmark that page to use for quick reference, if you like.

Less tangibly, I think it just looks nicer now. The header picture by the way is a Birmingham skyline, and I must confess that I nicked the original from the Birmingham Post and Mail before doctoring it for my purposes. If that august publication has any objection, I’ll find something else!

Counting words or characters (again)

To charge by the word, or by a character-based measure? The “line” of 55 keystrokes is of course the one I use, but there are other systems – it should be possible to convert precisely through a simple multiplication.

Why mention it again? Because on a recent large job I noticed that the translation process sent the word count up by no less than 35%, representing an uncertainty that neither the client nor the translator is likely to be comfortable with. Contrast this with the character-based measure of size. Measured that way, the text also changed size. It shrank. By well under 1%. I suggest that neither the client nor the translator is likely to be troubled by that.

No, I am not happy to charge by the “source word” for unseen texts

I have recently been asked again by an agency for my rate, which they require expressed “per source word”. For those of you who can read German there was a good discussion of this matter at the site of the agency Intra. It is no longer hosted there, but I got a copy from the Wayback Machine, and put it here.

Let me say that I do understand that agencies need to know, in advance, what a freelancer is likely to charge for a given translation. We freelancers are, I think, sensible if we reserve the right to modify this on sight of the actual text, just as we really must reserve the right to decline any job when we have actually had an opportunity to see the relevant files. But it is a commercial reality that agencies need to be able to estimate what a particular translator is almost certain to charge for a given job.

There was a time when the publishing world thought of the “word length” of, let’s say, a newspaper article as a purely nominal estimate, based on a double-spaced typed page, with an average of 10 (English) words per line, 30 lines per page. The typewriter wrote at 10 characters per inch, and that was that. Because the actual, precise word count is constantly available from your word processor, people have adopted the idea of using this as a measure of the “size” of a given text. But, as any agency with any experience of German should know, this is a very bad idea when it comes to measuring German. The problems are easy to see.

Firstly, I leave aside the issue of quoting for work simply on the basis of the quantity of text, ignoring questions of complexity, subtlety or style. There are things to be considered there, but for this article I will assume that we are dealing entirely with run-of-the-mill texts, typical for what a particular translator or particular agency handles.

The most obvious point is that German is positively famous for its long collocations, written without spaces to form a single word. People who don’t know the language sometimes think that it is a particular difficulty of German: “oh, all those long, long words.” In fact, of course, it is just something you quickly get used to. Let’s take a couple of examples.

We begin with something very simple: the source text says “Ich bin Alex”, and our translator summons their skill to convert this to English: “I am Alex”. Three words, in each case, and no obvious problem.

Next, we are translating a patent, written in German, for a new and inventive hairbrush. As is typically the case with patents, the document will probably not call it a “hairbrush”, to exclude the possibility that a competitor will come along with something very similar and say: “oh, but ours isn’t a hairbrush, it’s a multi-row comb, so your patent doesn’t cover it.” For this reason they will be likely to have an expression such as, “a hair alignment device, such as a hairbrush or comb”. It sounds comical, but there are good reasons for it. So our German text refers to a “Haarausrichtungsgerät”. As I say, not as strange as it might look, because “Haar” refers to our hair, “Ausrichtung” is alignment, and a “Gerät” is a device. (This is, by the way, a joke – as far as I know “Haarausrichtungsgerät” is not a real German word. But it could be!)

Now if we measure these texts by the “target word”, things are bad enough. If we do that, we are saying that the amount of work, skill, and so on involved in translating “hair alignment device” is exactly the same as that involved in translating “I am Alex”. Clearly not true. If, however, we base our charging on the number of characters, we have 21 characters as against 9 characters, and we would be saying that the translation of the long phrase is worth a bit more than twice as much as the short phrase. You might argue that the factor is too high, you might argue that it is too low, but at least it is a factor, and it recognises the fact that the first phrase is longer and more complex than the second.

If, however, you want to measure by the “source word”, things are even worse. For the reasons described above, agencies will naturally want to work on the basis of the source text, because they want to know in advance what you are likely to charge. Commercial reality again – in its way, it’s fair enough. But what happens here is that we end up saying that “Haarausrichtungsgerät” is only one word, and therefore that the translation of “I am Alex” is three times as difficult, and justifies a payment three times higher than the translation of “hair alignment device”. This is no longer merely “bad enough” – it is downright ludicrous.
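For anyone who likes to see this in figures, here is a minimal Python sketch of the three ways of measuring the two examples. It is only an illustration: the function names are invented for this post, and a “word” is assumed to be simply whatever sits between spaces, which is roughly how word processors count.

```python
# Illustrative sketch only: the function names are made up for this example,
# and a "word" is assumed to be any run of characters between whitespace.

def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def char_count(text: str) -> int:
    """Count characters, including spaces."""
    return len(text)


# (German source, English target) pairs taken from the examples above.
examples = [
    ("Ich bin Alex", "I am Alex"),
    ("Haarausrichtungsgerät", "hair alignment device"),
]

for source, target in examples:
    print(
        f"{source!r}: {word_count(source)} source word(s); "
        f"{target!r}: {word_count(target)} target word(s), "
        f"{char_count(target)} target characters"
    )
```

Counted by source words, the hairbrush phrase comes out at a third of the size of the greeting; counted by target words, the two are equal; counted by characters, the hairbrush phrase is a little more than twice the size.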

The solution is simple. With one click of the mouse I can see the number of characters in my text. When working in Word it’s at the bottom left-hand corner of the window. “Characters (with spaces)” is what it says. You can use this directly, but the numbers tend to be large. More conveniently, and conventionally, you can divide it by 55 and get the number of “standard lines”. Asking for advance pricing based on counts of the number of German words is highly unsatisfactory. To my mind it suggests that an agency that wants to do this may be lacking in experience.
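The arithmetic is trivial, but a short Python sketch may make it concrete. The function names and the rate used in the usage note are hypothetical, chosen only for illustration. The character count is assumed to be the “Characters (with spaces)” figure read off from the word processor.

```python
# Illustrative sketch only: names and rates are hypothetical.

CHARS_PER_STANDARD_LINE = 55  # the "line" of 55 keystrokes


def standard_lines(chars_with_spaces: int,
                   chars_per_line: int = CHARS_PER_STANDARD_LINE) -> float:
    """Convert a character count (with spaces) into standard lines."""
    if chars_with_spaces < 0:
        raise ValueError("character count cannot be negative")
    return chars_with_spaces / chars_per_line


def quote(chars_with_spaces: int, rate_per_line: float) -> float:
    """Price a text at a given rate per standard line, rounded to 2 places."""
    return round(standard_lines(chars_with_spaces) * rate_per_line, 2)
```

With these definitions, a text of 5,500 characters is `standard_lines(5500)`, that is 100 standard lines, and at an invented rate of 1.50 per line `quote(5500, 1.50)` comes to 150.00. Converting to another character-based system would only mean passing a different `chars_per_line`.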

Taking about 1 million recently translated words from my database, I see that on average I write almost exactly five English words for every four German words, but the length in characters is usually within 1% or so of the source. In fact my translation is more often marginally shorter than the original than marginally longer. The problem, of course, is not that the average German word is longer than the average English word, but that the spread of lengths is much higher, and that the really long words tend to occur in texts that are, in any case, more complex and difficult to understand and require a more specialised vocabulary to express in English.

Why you should use a proper translator

A perfect example from “My Switzerland” and the German version.

Read the first, and you may think that the Swiss are being refreshingly honest when they introduce “The city of Zug, well-known for its steep taxes…”, although you’d think they would at least quietly ignore that flaw. Read the second, and you will hear of “Die Stadt Zug, bekannt durch die tiefen Steuern”. Ahh, “low” taxes, not steep ones!

Machine translation? Or just poor quality? Perhaps the taxes are so very low, they had no money to pay for a proper translator!

I am asking them for comment, and will report.

TAUS, “data-driven translation” and the future

A few weeks ago my attention was drawn to an article at http://www.translationautomation.com/perspectives/translation-leaks.html. In the associated discussion, I said that “it would be easy to pick holes in his largely spurious arguments if I had time”. Someone commented that, “suggesting that you have the capability but not the time to pick apart somebody else’s cogent arguments is a bit like those people who sometimes offer to black one’s eye over the Internet,” so now I have a little time, here are my thoughts. I reproduce the article in full, so that my comments, which are in red, have the full context.

Tuesday, 07 December 2010 15:00 Jaap van der Meer

You trusted your bank. You trusted your currency. You trusted your government. You trusted your translations.

So what happens now? Your certainties are being unraveled one after another. The system you trusted is leaking. It is unsettling. And even scary… But then you realize: trust is good, and knowing the facts is always the best policy.

You trusted your translations, your carefully chosen terminology, and your translation memories built up with great care, and well protected against unauthorized use. And now, you realize that your translations are as good or as bad as any others, that your customers rarely read your translations and often rely on machine-translated texts – machine translations that, annoyingly, are sometimes surprisingly good. (Certainly some, and probably most of my clients do read my texts, in some cases quite carefully. If they don’t read yours, you must be a different branch of the industry.) You realize that others are sharing translation memories, perhaps even your translation memories that you have nurtured as your own assets. You realize now that your secure world of translation is leaking. You are losing control. The model isn’t working anymore. It is upsetting, but it is better to face this new reality. Let go of the illusion of control. (I never did have an illusion of control, thank you. I maintain my translation memories as carefully as I can, but I have sold my translations – it is open, as it always was, to my clients to do with them whatever they want.)

New reality

The world is changing and the translation industry is lagging behind. The industry still operates with a 20th century, western world mindset: the developed world exporting its goods and spreading its civilization to customers rich enough to pay. (That’s what I call business.) Translation is largely one-directional – from English into the major languages – and one translation per language fits all customers. (I don’t have the figures to challenge this, but it strikes me as a surprisingly narrow and extraordinarily Anglo-centric view to be held by somebody who works in the translation field. English may, indeed, be at present one of the world’s most important technical and commercial languages, but surely, by that very token, a huge slice of the translation business involves translating texts into English, and I am quite sure that another huge slice is between languages where neither is English.) Translation is priced by the word (in many cases) and managed in projects (that’s the way business is done). A project is traditionally product documentation, or instructions for use, or the user interface. Each project in principle is meant to increase sales in new markets and is measured by ROI. Efficiencies come from an overly simplistic (I take it that “overly simplistic” is an overblown way – should I say “overlyblown” – of saying “very simple”) technology called translation memory that was invented in the 80’s of the last century and has hardly advanced since then. (I am at a total loss trying to understand why the author regards software of this sort as in some way excessively simple. If he believes that it has not advanced over the last 20 years, I suspect he has not been using it. It has advanced hugely, as any user must surely know.)

Translation in the 21st century requires a very different vision. (I believe we are getting to his main commercial pitch.) Western hegemony is over. Products and services are developed, manufactured and marketed everywhere and anywhere. Customers are more self-confident and don’t read manuals anymore. They read blogs and peer reviews and pull information from customer support sites when they need it. In fact new generation users don’t need user instructions at all and if new products are well designed, they’re completely intuitive and let the users ‘plug and play’. But new generation customers are also more discerning when they buy a product.

In this new regime, translation is multidirectional, from any language into any language. (I think it always was – see above.) Quality requirements are different for different users and different usages. Machine translation is good enough for the largest volumes of dynamic web content (I wonder which language he has tried this out on. Recently I have had occasion to make considerable use of Google Translate for Italian, as I now live in Italy without speaking the language. Italian may be less important than English, but it is scarcely an obscure language. The plain fact is that in most cases the reader is lucky if the sense rises above gibberish. Worse still than being incomprehensible, it is not uncommon for it to be entirely wrong, failing for instance to see negatives or inserting negatives where they do not exist.), whereas pre-sales texts require a step-up in quality from the current one-translation-fits-all policy. (Who uses this policy? I suspect most of my clients have always had a higher standard.) Tuning in to the style and sub-culture of niche customer groups makes all the difference in an increasingly global marketplace. Word-based pricing and ROI measurement do not make much sense in this new economic reality. Translation memory software still serves its goals in the shrinking business of manual translations (Is there any evidence that the market is actually shrinking? I somehow doubt it), but it is totally inadequate for the growing volumes of dynamic content.

A Lesson for Translation in the 21st Century

If you can see even half way through this new reality, your concerns over translation leaks will begin to give way to a growing sense of excitement. Translation is coming out of the dusty libraries. (I think that happened some while ago!) Translation is gaining in relevance and significance as a worldwide service industry with billions of customers. Translation as a feature on every web site and every mobile device is the key to a vital global economy.

It is still unsettling of course when you realize that 90% of the translated words will be generated by machine translation engines, probably at no charge to the end-user. But considering there is a non-stop stream of multimedia information, the translation market will certainly innovate and assert its value in different ways. Naturally there is no need for every business to change and everyone to automate. In fact there will be a growing need for high-quality, tailored translations. But if you are tempted to join the innovation wave, I am sure you will figure out a way to prosper in this rapidly changing environment.

One aspect of the future of translation, however, is easily overlooked: the importance of ‘data’. ‘Data’ replaces the role of ‘translation memories’ as the key to efficiency. A jet engine with a thousand times the power of those 1980s propellers. Data drive translation engines. Data will control the quality and the efficiency of translation in the future. (This is a highly contentious point, and there is no substantial evidence yet that data-driven translation engines can do an even adequate job, let alone a good one, of anything but the most restricted, narrow types of text with highly concrete references. Perhaps the directions for how to get from Wimbledon to Andover might be translated this way, although I would want to be sure that the source didn’t contain anything “challenging” like “don’t turn left here, even though the sign tells you to”. I concede that data may indeed control the quality of this kind of translation – it is likely to keep it at a low level.) Whoever has access to the data controls the future of translation. Privileged or monopolized access to data will jeopardize the blossoming of a 21st century translation industry. (Jeopardize the author’s project?) Ownership of translation memories – translation data – is therefore an important and sensitive topic of debate. The legal argument will not help us much longer in this age of translation leaks. (I take it that he is trying to preemptively prepare the ground in the hope of deflecting the criticism that much of what TAUS plans may well be in breach of copyright.) Data are mined, scraped, masked, shared and used by everyone from individual translators to large global corporations. Attempting to make a legal case against the unauthorized use of translation data will probably not work. The practical argument is all that counts (he hopes), and once translations are published, there is no way to control the leaks (he hopes). And to be honest, wouldn’t you rather turn the whole argument round?
If your translations are not confidential, why not simply share these data with everyone who can use them to improve the efficiency and quality of translation as a whole. What stops you from doing this? (What stops me? Professional discretion, for a start, even if I haven’t signed a nondisclosure agreement (NDA). I don’t know quite how typical my position is, but I do know that it is not unusual. As a freelance translator, I have sold my translations. I feel that I have the right to recall my past translations, and to use technology to assist me in that task, and in that way to improve my future translations. But it seems to me quite clear that, at least in the vast majority of cases, copyright was assigned to the client. Suppose that a German nut manufacturer wrote some brilliant advertising copy, and imagine that, after my attentions, the English version was still brilliant. Let’s imagine that the two were published side by side on the client’s website. Now let’s suppose that a German bolt manufacturer sees this and thinks “Wunderbar, ve kann kopy zis text, und verr ze nut makers haff written “nut”, ve kann write “bolt”, und ve haff ze über-brilliant advertising kopy for ten cents only und ve kann sell our bolts to ze British and ze Amerikans!”. Clearly there would be a problem, and it seems to me obvious that it would be the nut-makers who would have a case against the bolt makers for breach of copyright. My translation was sold, and it was up to them to do whatever they like with it. Now over the years I have worked for over 50 clients, and since many of those have been agencies, there must have been several hundred end-clients. If I were to share my translation memories with TAUS I would therefore need to get written permission from hundreds of owners, many of whom, for obvious commercial reasons, are unknown to me. And that is true even for those where I have not signed any sort of NDA. 
I wonder to what extent TAUS is simply hoping that nobody will ever notice how much copyright material is present in the database they are assembling.

Oh, yes, there is another reason that stops me from doing this. Believe it or not, rather than paying me a substantial sum for joining them and sharing my translation memories, they want me to pay them. Work that one out!)

Wishing you Wisdom (I think he means “Wishing you will come over to my point of view”)

Here we are, at the start of the second decade of a new millennium, facing some real dilemmas. I know it’s hard to take such a radical step from one world into another. But be aware that while you are puzzling over which direction to go, your translations are leaking.

We wish you wisdom and success in 2011 and the decade ahead. We at TAUS are here to help as the industry think tank (or so they want us to believe), your innovation partner and a safe harbor for sharing your translation memories.


REGISTER FOR WEBINAR: TRANSLATION IN THE 21st CENTURY

8am PDT / 5pm CET
Wednesday 15 December
45-minutes in duration

We share insights on the future of the translation industry. Content is based on a number of market and collective intelligence exercises undertaken by TAUS during 2010. This includes continuous review of the market, ideation sessions with major translation decision makers, and discussion with leading scientists, amongst others

(I included this advert from the bottom of the webpage, as an example of the language used by these language professionals. “Ideation sessions.” Good grief.)