A perfect Babelfish in 2026? No way

Under the headline ‘The language barrier is about to fall’, Alec Ross in yesterday’s Wall Street Journal makes a strong claim: ‘In 10 years, a small earpiece will whisper what is being said to you in your native language nearly simultaneously as a foreign language is being spoken.’ So the Babelfish will finally spread from The Hitchhiker’s Guide to the Galaxy to the real world.

But I think Ross is wrong, for four reasons.

1. The prediction has been around for over half a century without coming true. Machine translation is one of the holy grails of technological development, up there with nuclear fusion (with electricity ‘too cheap to meter’, a promise from 1954) and autonomous cars – I remember first hearing about those in my 1970s primary school, and the idea wasn’t new even then. Of course, none of this disproves Ross’s claim, but it does justify a certain scepticism.

2. Computers have a hard time interpreting the human voice, and speech recognition software always needs a training period to become familiar with any specific user. Under suboptimal conditions – noisy surroundings, sloppy or emotional delivery – it has even more trouble converting our sounds into words. On the other hand, this technology has advanced considerably in recent years, so perhaps – I’m not a good judge of that – it only takes increasingly smart algorithms and more brute computing force to perfect this particular skill required of a Babelfish.

3. The quality of current machine translation is mostly horrible, in spite of frequent claims to the contrary. I am an almost daily user of Google Translate, and though my expectations are low, even so I find the results underwhelming 8 times out of 10. One particularly exasperating example is what it does to names, such as cities: instead of just leaving them as they are, it tends to replace them, turning ‘Olomouc’ into ‘Birmingham’ and ‘Esperanto’ into ‘English’. Sometimes I can understand why, but more often I’m just flummoxed.

Only yesterday, I had GT translate a Dutch-language blogpost into Spanish. With Spanish being the third largest language online and Dutch the 12th, this can hardly be considered an exotic pair of languages. Nevertheless, the result was grade-A gobbledygook. The English translation, I should add, was reasonable, though probably not as good as any Dutch student taking their O levels could produce. GT has the word ‘Translate’ in its name for a reason: it is currently little more than a (rather poor) device for translating into and out of English.

Translating, mind you – not interpreting. We feed it strictly written language. The verbal jazz that we call ‘ordinary speech’, full of unfinished sentences, repetitions and filler words, would send GT into a fit. This suggests that Google Interpreter, the device predicted by Ross, is very far off indeed.

4. The main reason for my disbelief is a fundamental flaw in translating software: it processes language statistically, and statistically only. It compares all input with its database, looks for the best and most frequent matches, and thus chooses the output most likely to be correct. It works – but only so far.

Nor can it get any further, because it fails to do two crucial things: parsing and comprehending. Computers don’t understand grammar and they are clueless about the human experience of reality. As long as that doesn’t change, they will get nowhere near the capabilities of translators and interpreters of flesh and blood and brain. A machine that translates the Dutch word for ‘a bet’ (weddenschap) into a Spanish verb meaning ‘he bet’ is a grammatical idiot. A machine that can’t pick the right meaning of prijs (‘price’ or ‘prize’) is a moron – given a context, any reader can do it in a split second.
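To make the ‘most frequent match’ idea concrete, here is a deliberately crude sketch in Python. It is not how Google Translate is actually built: real systems work with far richer statistics than single words. The tiny ‘corpus’, its counts and the function names are all invented for illustration. The point is only to show how a purely frequency-based choice, with no grammar and no knowledge of the world, picks ‘price’ for prijs even in a sentence about winning something.

```python
from collections import Counter

# Hypothetical toy "parallel corpus": (Dutch word, English word) pairs as they
# might be observed in aligned texts. The counts are invented for illustration.
ALIGNED_PAIRS = [
    ("prijs", "price"), ("prijs", "price"), ("prijs", "price"),
    ("prijs", "prize"),
    ("weddenschap", "bet"),
]


def build_table(pairs):
    """Count how often each source word was seen with each translation."""
    table = {}
    for source, target in pairs:
        table.setdefault(source, Counter())[target] += 1
    return table


def translate_word(word, table):
    """Return the most frequent translation; leave unknown words untouched."""
    if word not in table:
        return word
    return table[word].most_common(1)[0][0]


def translate(sentence, table):
    """Translate word by word, ignoring grammar and context entirely."""
    return " ".join(translate_word(w, table) for w in sentence.lower().split())


TABLE = build_table(ALIGNED_PAIRS)

# "She won the first prize": the context calls for 'prize', but frequency wins.
print(translate("zij won de eerste prijs", TABLE))  # zij won de eerste price
```

Any human reader sees ‘won’ and ‘eerste’ (first) and knows a prize is meant; the lookup table only knows that ‘price’ turned up more often.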

In my experience, GT has for several years now been more or less stagnant at its current level, which I would describe as ‘helpful hallucination’. I am convinced that it will not substantially improve without a grasp of grammar and some worldly wisdom. And as for the article in the Wall Street Journal: barring a scientific breakthrough, the same prediction can and probably will be made again in ten years’ time.


