Do we need to look for new software?

In an unguarded moment of misguided enthusiasm (there is no other way to put it) I volunteered to translate a couple of my favorite TED talks. The idea was simple: challenging myself to learn the literary side of translating whole pieces of text would let me get to the innards of the language that is my mother tongue and that I use for conversation. It turns out there was an area I had never factored in.

Talks have transcripts, and these are whole blocks of dialogue which feel different under translation than the User Interface artifacts that make up the components of the software I translate. In some confusion I turned to the person who does this so often that she’s really good at poking holes in any theory I propound. In reality, it was my turn to be shocked. When she translates documents, Runa faces problems far deeper than the ones I faced while translating transcripts. And her current toolset is woefully inadequate because it is tuned to the software translation way of doing things rather than to the translation of documents, transcripts and other pieces of text.

In a nutshell, the problem relates to the breaking of text into chunks that are malleable for translation. More often than not, if the complete text is a paragraph, or at least a couple of sentences, the underlying grammar and construction are built to project a particular line of thought: a single idea. Chunking breaks that seamless thread. Additionally, when using our standard tools, viz. Lokalize/KBabel, Virtaal, Lotte and Pootle, such chunks of text make coherent translation more difficult because of the need to fit things within tags.

Here’s an example from the TED talk by Alan Kay. It is not representative, but would suffice to provide an idea. If you consider it as a complete paragraph expressing a single idea, you could look at something like:

So let's take a look now at how we might use the computer for some of this. And, so the first idea here is just to show you the kind of things that children can do. I am using the software that we're putting on the 100 dollar laptop. So, I'd like to draw a little car here. I'll just do this very quickly. And put a big tire on him. And I get a little object here, and I can look inside this object. I'll call it a car. And here's a little behavior car forward. Each time I click it, car turn. If I want to make a little script to do this over and over again, I just drag these guys out and set them going.

Do you see what is happening? If you read the entire text as a block and grasp the idea, a context-based translation that presents the same thing lucidly in your target language starts taking shape.

Now, check what happens if we chunk it in the way TED does it for translation.

So let's take a look now at how we might use the computer for some of this.

And, so the first idea here is

just to show you the kind of things that children can do.

I am using the software that we're putting on the 100 dollar laptop.

So, I'd like to draw a little car here.

I'll just do this very quickly. And put a big tire on him.

And I get a little object here, and I can look inside this object.

I'll call it a car. And here's a little behavior car forward.

Each time I click it, car turn.

If I want to make a little script to do this over and over again,

I just drag these guys out and set them going.

Taken out of context, the chunks do make threading the idea together somewhat difficult. At least, it seems difficult to me. So, what’s the deal here? How do other languages deal with similar issues? I am assuming you will not simply consider the entire paragraph, translate accordingly and then slice and dice the result to match the chunks. That is difficult, isn’t it?
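To make the idea concrete, here is a rough sketch of what I imagine a friendlier tool could do: keep the chunks as they are, but show each one together with its neighbours so the thread of the idea stays visible. This is only an illustration under my own assumptions. The function names `join_chunks` and `with_context` and the `window` parameter are made up for this example and do not correspond to anything in the TED interface or in the tools named above.

```python
from typing import List, Tuple


def join_chunks(chunks: List[str]) -> str:
    """Rebuild the full paragraph so it can be read as one block."""
    return " ".join(chunk.strip() for chunk in chunks)


def with_context(chunks: List[str], index: int, window: int = 2) -> Tuple[str, str, str]:
    """Return (text before, chunk to translate, text after) for one chunk.

    `window` is the number of neighbouring chunks shown on each side.
    """
    start = max(0, index - window)
    before = " ".join(chunks[start:index])
    after = " ".join(chunks[index + 1:index + 1 + window])
    return before, chunks[index], after


# A few chunks taken from the example above.
chunks = [
    "I'll call it a car. And here's a little behavior car forward.",
    "Each time I click it, car turn.",
    "If I want to make a little script to do this over and over again,",
    "I just drag these guys out and set them going.",
]

paragraph = join_chunks(chunks)
before, current, after = with_context(chunks, 2, window=1)

print("Whole paragraph:", paragraph)
print("Before:", before)
print("Translate this:", current)
print("After:", after)
```

The translator would still enter one translation per chunk, but would read the half-sentence "If I want to make a little script to do this over and over again," with its continuation in view rather than in isolation.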

On a side note, the TED folks could start looking at an easier interface for translation. I could not figure out how to translate, save as draft, and return later to pick up where I left off. It looks like it mandates a single-session, sit-down-and-deliver mode of work. That isn’t how I am used to doing translations in the FOSS world, which makes it awkward. Integrating translation memories, which would be helpful for languages with a substantial body of existing work, and auto-translation tools would be sweet too. Plus, they need to create a forum for asking questions; the email address seems to be unresponsive at best.

10 thoughts on “Do we need to look for new software ?”

  1. The way TED is handling translations is obviously completely broken. I wonder if whoever came up with that system actually did any form of translation, ever. The best way to translate that kind of text is to fire up a text editor and progressively (in linear sequence) replace the original with the translated text. At least that’s how I (a complete amateur at translations, though I occasionally have to translate something between 2 of the 4 languages I speak) translate that kind of text. Tools just make things harder for continuous texts, and chunking is artificial, pointless and counterproductive.

  2. There’s no way to perform a decent translation sentence-by-sentence on such text. We translate documentation paragraph-by-paragraph and I don’t see any reason not to do so with the transcripts.

    sankarshan Reply:

    Precisely. Which means that one has to junk the TED-provided interface and use a standard text editor to translate. This of course brings up a different question: how do I then shove the sentences in to match the English ones on the TED UI? 🙂

  3. Translation cannot be done para-by-para either. You have to read the whole thing, get inside the mind of the author and progressively churn out a translation. This requires several passes through the *whole* document. That is the only way a consistent translation can be done.

    sankarshan Reply:

    I agree with you. It is impossible to just read parts of text and then translate. The idea of a piece of text is that the author is trying to project a central idea – constant and repeated reading of the text allows the idea to be formed in the mind of the translator. This is opposed to the translator interpreting the idea and trying to translate.

    My concern is the way the TED folks are asking for the translation to be done. For example, this is more like subtitle translation, which, I think, is a new skill one would need to acquire. At least I’d need to.

  4. Exactly. I did not want to put out sentence segmentation, or the lack thereof, as the problem. In effect, the TED folks require the subtitles to be translated, which is different from translating pure application software, a website or even a document.

    I am a dilettante when it comes to translations and am game for acquiring new skills. I’d love to learn from those who translate subtitles how it is done. That said, I am fairly certain that the software backend being used by TED isn’t the best fit for the job.
