Tag Archives: Linux

On the road to an Indian language GNU/Linux OS

The history of Indic localization of Linux (GNU/Linux …) may never be written down in the detail that it deserves, especially the detail needed to ensure that the significant events are well recorded.

Lest we forget, with the help of sayamindu and karunakar I came across the following dates:

  • IndLinux Hindi v0.37 (Milan) was released in October 2003. It was a LiveCD that allowed you to check out a localized GNU/Linux desktop environment, input and display
  • The AnkurBangla LiveCD was also released somewhat later that same year
  • From May 2004 onwards, the Utkarsh Project released and maintained Gujarati Localizations
  • In November 2004, Fedora Core 3 shipped and you could choose to boot into an Indic locale and get your work done. You can also check this page for the RHEL release that happened.
  • In 2006, the IndLinux project released Rangoli
  • In April 2007, Debian Etch was released with an installer localized in Indic languages (Bengali, Gujarati, Hindi, Malayalam, Nepali, Punjabi, Tamil). As a consequence, Debian (and Ubuntu) users can set up a fully Indic-localized system from scratch.

So, before you go and listen to folks talk about the “first Indian language GNU/Linux Operating System” on the media channels, keep these dates in mind.

Update: These are in no way claimed to be the only dates that are relevant. So, if you do recall the dates of releases from other groups working on Indic L10n on Linux, please feel free to leave them in the comments, with URLs if possible.


Tools of the translation trade

I begin with a caveat: I am a dilettante translator, and hence the tools of my trade (these are the tools I have used in the past or use daily) and the steps I follow might not reflect reality, or how the “real folks” do translation. I depend to a large extent on the folks doing the translation and localization bits for my language, and I build heavily on their work.

KBabel

I used it only infrequently when it was around in Fedora (it is still available in Red Hat Enterprise Linux 5) but once I did get over the somewhat clunky interface, it was a joy to work with. Seriously rugged and well attuned to the ways of doing translations, KBabel was the tool of choice. However, it was replaced by Lokalize (more on that later) and so, I moved on to Lokalize.

Lokalize

This has so much promise and yet, it leaves so much to be desired in terms of stability. For example, a recent quirk that I noticed is that in some cases, after translating files using Lokalize, viewing them in a text editor shows the translated strings, but loading them in KBabel or another tool shows the lines as empty. The KBabel -> Lokalize transition within KDE could perhaps have done with a bit of structured requirements definition and testing (I am unaware as to whether such things were actually done and would be glad to read up on any existing content about that). Then there’s this quirk for the files in the recent GNOME release: copying across the content when it is in the form Address leaves the copied form as empty space. The alternative is to input the tags again, which is a cumbersome process. There are a number of issues reported against the Lokalize releases, which actually gives me enough hope, because more issues mean more consumers and hence a need to have a stable and functional application.
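
When one tool shows translated strings and another shows them as empty, it can help to check the file without any translation editor in the loop. What follows is only a rough sketch, assuming plain gettext PO files. The names `parse_po` and `untranslated` are made up for illustration, and the parser deliberately ignores plural forms and `msgctxt`, so it says nothing about why a particular tool misreads a file.

```python
def parse_po(text):
    """Return (msgid, msgstr) pairs from simple PO text.

    Simplified on purpose: only msgid/msgstr and their continuation
    lines are handled; plural forms and msgctxt are skipped.
    """
    entries = []
    fields = {}      # "msgid" / "msgstr" -> accumulated raw string
    current = None   # field that continuation lines belong to

    def flush():
        # Keep only entries that have both a msgid and a msgstr.
        if "msgid" in fields and "msgstr" in fields:
            entries.append((fields["msgid"], fields["msgstr"]))
        fields.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            flush()
            current = "msgid"
            fields[current] = line[len("msgid "):].strip()[1:-1]
        elif line.startswith("msgstr "):
            current = "msgstr"
            fields[current] = line[len("msgstr "):].strip()[1:-1]
        elif line.startswith('"') and current:
            # Continuation line: append the text between the quotes.
            fields[current] += line[1:-1]
        else:
            # Comments, blank lines and unsupported keywords end a field.
            current = None
    flush()
    return entries


def untranslated(entries):
    """Return the msgids whose msgstr is empty (the header entry is skipped)."""
    return [msgid for msgid, msgstr in entries if msgid and not msgstr]


# Made-up sample content, not taken from any real project.
SAMPLE = '''msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

msgid "Address"
msgstr "ঠিকানা"

msgid "Open "
"file"
msgstr ""
'''

if __name__ == "__main__":
    for msgid in untranslated(parse_po(SAMPLE)):
        print("empty translation:", msgid)
```

If a file that looks translated in a text editor still yields entries here, the strings are probably not where the PO format expects them to be, which would be worth mentioning in a bug report.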

Virtaal

I have used it very infrequently. The one reason for that is that it takes some time to get used to the application/tool itself. I guess sometimes too much sparseness in the UI is a factor in shying away from the tool. The one good point which merits a mention is the “Help”, or documentation, in Virtaal: it is very well done and actually demonstrates how best to use the application for day-to-day translation work. This looks to be a promising tool and, with the other parts like translation memory, terminology creator etc. tagged on, it will have the makings of a strong toolchain.

Pootle

I had been initially reluctant to use a web-based tool to do translations. This however might have been a factor of the early days of Pootle. With the recent Pootle releases, having a web-based translation tool is a good plus. However, it isn’t without its queer flaws. For example, it doesn’t allow one to browse to a specific phrase to translate (in other words, in a 290-line file, if you last left off at 175, the choices are either to traverse from the start in bunches of 10 or 7, or to traverse from the end till one reaches the 176th line). Also, the instances of Pootle that I have used don’t use any translation memory or terminology add-ons to provide suggestions.

I have this evolving feeling that having a robust web-based tool would provide a better way of handling translations and, help manage content. That is perhaps one of the reasons I have high expectations from the upcoming Pootle releases and, of course, Lotte.

Irrespective of the tools, some specific things that I’d see being handled include the following. I hope that someone who develops tools to help get translations done takes some time out to talk with the folks doing it daily to understand the areas which can do with significant improvements.

  • the ability to provide a base glossary of words (for a specific language) and, the system allowing it to be consumed during translation so as to provide a semblance of consistency
  • the ability to take as input a set of base glossaries across languages (for example, a couple of Indic languages do check how other Indic languages have handled the translation) and, the system allowing the translator/reviewer to exercise the option of choosing any of the glossaries to consult
  • provide robust translation suggestions facilitating re-use and, increasing consistency
  • a higher level of handling terminology than what is present now
  • a stronger set of spell checking plumbing
  • store and display the translation history of a file
  • the ability to browse to a specific string/line which helps a lot when doing review sprints or, just doing translation sprints
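
A few of the wishes above (a base glossary consulted during translation, suggestions that encourage re-use, and jumping straight to a given string) can be sketched in a few lines. This is only an illustration under loud assumptions: the glossary, the translation memory and the function names `glossary_issues`, `suggest` and `jump_to` are all hypothetical, the matching is naive word and string similarity from Python’s standard `difflib` module, and none of it describes how Pootle, Lokalize or Virtaal actually work.

```python
import difflib

# Hypothetical base glossary: source term -> preferred translation.
GLOSSARY = {"file": "ফাইল", "folder": "ফোল্ডার"}

# Hypothetical translation memory: earlier source string -> its translation.
MEMORY = {
    "Open file": "ফাইল খুলুন",
    "Close file": "ফাইল বন্ধ করুন",
    "Create folder": "ফোল্ডার তৈরি করুন",
}


def glossary_issues(source, target, glossary):
    """Return glossary terms used in `source` whose preferred
    translation does not appear in `target`.

    Naive on purpose: whitespace word splitting, no stemming or punctuation handling.
    """
    words = source.lower().split()
    return [term for term, preferred in glossary.items()
            if term in words and preferred not in target]


def suggest(source, memory, cutoff=0.6):
    """Return (earlier_source, earlier_translation) pairs whose source
    text is similar to `source`, best match first."""
    close = difflib.get_close_matches(source, list(memory), n=3, cutoff=cutoff)
    return [(s, memory[s]) for s in close]


def jump_to(units, number):
    """Return the unit at 1-based position `number`, e.g. to resume
    a review sprint at string 176 instead of paging from the start."""
    if not 1 <= number <= len(units):
        raise IndexError(f"no string number {number} in this file")
    return units[number - 1]


if __name__ == "__main__":
    print(glossary_issues("Open file", "নথি খুলুন", GLOSSARY))
    print(suggest("Open files", MEMORY))
    print(jump_to(["first", "second", "third"], 2))
```

Passing the glossary and memory in as plain dictionaries keeps the second wish within reach as well: consulting how another Indic language handled a term only means handing a different dictionary to the same functions.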

Update: Updated the first line to ensure that it isn’t implied that these are the only tools anyone interested in translation can use. These are tools I have used or, use daily.

Update: Updated the “wish-list” to reflect the needs across tools, as opposed to implying that they are being requested only of Pootle.

That lazy,hazy,crazy bug of summer

There was this niggling blog entry which pointed to a bug. And, what a bug it was. A problem that is somewhat well entrenched when it comes to Bengali (India) bits is that of pluralism. There are far too many stakeholders and an even larger number of dispersed data points that need to be lined up before a conclusion can be arrived at. Especially so, because over the years there has been plenty of discussion about divergence of views on the data points rather than convergence of opinion leading to the closing of tasks.

So, Runa poked and prodded a few folks. And, got things done. That’s awesome and that’s what makes her a rockstar.

It only gets funnier…

I had mentioned Baishakhi Linux in passing in an earlier post. So, today a few of us received an interesting mail from Prof Anupam Basu of IIT-Kharagpur. The original mail is here, as it is, for ready reference. I have a few points to make, which are below. I will abbreviate Prof Anupam Basu as PAB and my responses as SM.

However, at the outset, let me state that it seems that the team working on the project can do well to buy copies of Karl Fogel’s excellent work Producing Open Source Software.

PAB: The release was on the 8th of September and it was planned to make the source code
available upstream, within a few days as is expected of open-source activities.
Please keep your weather eyes open on the SNLTR site.

SM: I recall asking twice about this and, this is the first time we hear about plans
to make source code available upstream. However, it isn't upstream source code that I'd be
interested in. I'd be interested in having access to the source of the distribution itself.

PAB: Mr. Toshi Kubota (mail cc'd to him) is a pioneer in Open-Source and Linux related
movements in Japan. We will ensure that all gpl requirements are adhered to in accordance
with his guidance. Please note that the inauguration was only day before yesterday !!!!

SM: This is what is generally called a straw-man argument. If you note, at no point have
any aspersions been cast on the level of expertise, or, competence of Mr Toshi Kubota.
Instead, what has been asked is a simple question - why have the translator credits been
mangled in the headers (that have been obtained from the .mo files) ? For instance, I don't
recall Promathesh Mondal following any of these steps.

PAB: I know ( because Indranil Dasgupta himself told me on the first day we were discussing
and demonstrating a prelim version of Baishakhi Linux) that the existing Linux versions,
printout of complex Bengali scripts through Firefox was not possible - thanks to Indranil -
this problem is not there in Baishakhi Linux.

SM: If you could point to a suitable sample of such complex scripts I'd be more than glad to
test it out on Fedora9 or rawhide -> Fedora10. As on date, I am yet to see on the Firefox
bugzilla a bug from your team that provides specimen cases that drive home the point that the
complex script rendering is an issue on Linux.

PAB: The contributions may be incremental, (as suggested by Sayamindu - a one line code), but
that is there now. Baishakhi Linux need not make tall claims, it was a very low budget effort
and the spirit should be to contribute more by pointing out bugs and improving on it.

SM: I read two things from this: [i] the distribution does include the one line code that
Sayamindu points out ie. it isn't materially new and [ii] none of the fixes that Baishakhi
claims to be 'features' have the patches attached to the already existing bugs in upstream
bugzilla(s).

PAB: I am pained because my invitations to some of the Open-Source groups to join hands in
these activities was met with absolute silence.

SM: 'Let us be objective first' (quoting you) - who are these 'some of the Open-Source groups' ?
And, when you invited them, did you point out the mode to join and contribute ?

In short, Prof Anupam Basu, we met a little over a year back at the ‘Bangla in e-Governance’ meeting and even then I had taken time to ask you to work with our group to push all the work you do into upstream. Not surprisingly, it hasn’t happened. I don’t expect it to happen either, but your mail was something of a surprise. More so, since you chose to include Sayamindu’s mother in cc: – what were you thinking ? 🙂

Sayamindu has a response as well.