अनिल एकलव्य ⇔ Anil Eklavya

February 6, 2010

The Elite Strikes Back, Fetishiously

From right after the transfer of power from the British to the local English Elite (the Babus in the broadest sense), one recurrent theme in the Indian ‘National’ press, which translates as the English press, has been to come down like a 16 ton weight on anyone who so much as mentioned the case of the Indian languages and the extraordinary privileges enjoyed by the English speaking Elite in the country. So, for example, if any politician of the Hindi belt suggested that students should be allowed to write some important exam in Indian languages or that English should not be compulsory at the primary level or even something much less radical-revolutionary and world shaking, there would be (without fail) editorials in the ‘National’ newspapers about how the language chauvinists are going to lay waste our great democracy.

With the changes that have happened in the last 15 years or so (some for better and more for worse), this trend became less common. But now the lumpen antics of the Thackerays have given the Elite a golden opportunity to come back with a 32 (or is it 64?) ton weight on the ‘language chauvinists’.

The way the Thackerays have been able to carry on their thuggery (in the Hindi as well as the English sense of the term) is so absurd that only a few things can compete with it. And one of those things is the fact that the English Elite of the country have been so amazingly successful in summarily suppressing all Indian languages including the legally National Language (Hindi), the language that has the most chauvinistic support from its speakers (Tamil) and the language of the most intellectual community of the sub-continent (Bengali). These and many others are not endangered languages (at least not yet). Most of them can be called mega languages in terms of the number of speakers. All of this is so well known and so often repeated that I feel weary of having to write this. Also equally well known is the fact that only a very small fraction of the Indian population is comfortable with English. However, as India is a society whose structure is mainly defined by the caste system, no one except the top caste wants to remain in their own caste. They all want to make the transition to the higher castes, even as they list the reasons for the greatness of their caste. And the highest caste now effectively is that of the English speakers, who have replaced the (literal) Brahmins from their perch at the top (I know, ‘replaced’ is not a good term because a large fraction of the Elite is Brahmin). Naturally then everyone wants ultimately to make the transition to the top caste. This has led to one of the most comic and absurd fetishes about any language anywhere in the world. It is the fetish for the English language. This fetish too is well known, though rarely talked about in the English media. A recent issue of the Outlook magazine was an exception. (The issue was the exception, not the magazine). The ‘language media’, of course, used to talk about it. Innumerable books have been written about it. Movies have been made about it (a recent one being Tashan, one of whose stars is now living out his character’s fetish in the real world). And sometimes politicians have talked about it for electoral purposes. But most of them have learned that it doesn’t pay much as the Indians (especially the North Indians) are not very keen to be seen speaking their own languages when in respectable company. They don’t even want it to be known to anyone that they are not good at English. Parents who can’t speak the language will parade their English learning children in front of any visitor and have a little performance of nursery rhymes being chanted in English, even if the visitor as well as the child feel tortured. They will also mention with pride that their child is very poor in Hindi (or any other Indian language).

It’s not that no one in the English speaking community has noted this. Even Nayantara Sehgal had mentioned this in one of her novels long ago. More recently Arundhati Roy had written about the oustee villagers from the Narmada dam site being scolded by Maneka Gandhi for not writing their petition in English, after they had travelled all the way, enduring hardship and hoping to save their lives. There have been others like Namita Gokhale among the (English speaking) writers and artists who have at least hinted at the absurdity of the situation.

But, by and large, the Elite has managed to suppress all talk about any fairness with regard to Indian languages which account for the overwhelming majority of the population of India. They have used diversity as an argument for maintaining the hegemony of English. They have used chauvinism as an argument. They have pitted one big language (Tamil) against the other (Hindi). They have pitted small languages (the so called dialects of Hindi) against big languages. They have pitted Dalits against the upper castes: no matter that most of them belong to the upper castes themselves. They have used linguistically spurious claims about the superiority of English over the ‘less developed’ Indian languages. They have steadfastly refused to concede even a pinhead worth of territory to the Indian languages.

Talk of divisiveness.

Unfortunately for them, The Market (whose praise they are now singing, be they from any part of the political spectrum) may be a brutal place, but it has allowed the Indian languages to gain some territory. As had the linguistic reorganisation of the states, which also (like the demands for linguistic fairness, not like The Market) they have always kept railing against.

When Pepsi and the others came after The Reforms, they didn’t give a damn about which language they used as long as it got them more customers. Before that, big companies in India preferred to make commercials in English, unless their product was some low brow thing that no one would want to talk about. It is understandable why: the top advertising agencies are mostly dominated by the elitest of the Elite. It must have been hard for them to get used to the presence of Indian languages in their midst. To give the devil his due, they have managed the transition quite well, at least on the public front. It has turned out that these underdeveloped languages can be used ‘creatively’ after all, whatever may be the purpose. I don’t know what to feel about this.

The people may be ashamed of their own languages and of being seen reading books in them (chauvinism indeed!), but they are hooked to the movies and T.V. serials in those same languages. The movie scene is not any less hilarious either. The people involved in these movies may be making their career, earning huge amounts of money and generally being the gods of urban life in India (along with the cricket stars) through Indian languages, but they too are equally ashamed of the languages they make movies in. The scripts of Bollywood movies are written using the Latin alphabet. More than one big Bollywood Hindi movie star has been on record saying he hates Hindi. One of them said he didn’t want anyone around him speaking in Hindi. Offscreen, all they want is for their lives to be copies of Hollywood stars. And they are prepared to pretend that their mediocre work in ‘foreign’ English movies (to the extent they get such work, the chances of which are increasing now as the real superpower focuses a little bit more of its attention eastward this side) is by far better than their best work in Hindi movies. They will tell you the reason for this too: English movies give them far more exposure than Hindi movies (if they do, what does quality matter?). As for the criticism which suggests otherwise, well, ‘it will die its own death’.

Another of the cards the Elite uses against any demand for linguistic fairplay is that of communalism. The fact that the Jan Sangh/BJP and the Sangh Parivar in general have been shouting the slogan of ‘Hindi, Hindu, Hindustan’ has been used time and again to put down (and discredit) any such demand. This time they are vehemently talking about how the ‘Hindi fetish’ of the BJP and the Sangh Parivar has brought about the Thackerays’ Marathi version of the same. One of them has grudgingly noted, though, that there are differences between the two.

The only part of the slogan in which the BJP and the Sangh Parivar are interested is the Hindu part, and they have made a travesty of even that. The preferred name for India for them is Bharat, not Hindustan. India is referred to as Hindustan (or Hindostan) more in the Urdu literature than in the Hindi literature or in the literature of these right wingers.

As a person whose mother tongue is Hindi (standard Hindi, Khari Boli) and who wants to write in Hindi, I refuse to surrender all the rights of this language or the terms Hindustan, Parivar, Sangh (or even Hindu) etc. to the Sangh Parivar conglomeration. The Elite has done its best to give exclusive rights for all these to the conglomeration. I keep the rights to these as an individual, not as a member of a group. I also keep the rights to contribute and participate as an individual, without being a member of any group.

The plain fact is that injustices are committed on a large scale every day in this huge country in the name of languages. However, there can be no doubt that the largest number of these injustices are in the name of English. Time and again I have seen (first hand) how careers of even brilliant students go the steep downward path because they are not so good at English. And careers are just a small part of the picture. If you are involved in a court case, you are unlikely to be heard if you use an Indian language.

I am not talking about a Polish person’s case not being heard properly in France because he can’t talk in French. Even that, as a lot of the members of the Elite perhaps know, can be a valid grievance.

The plain fact is also, as a prominent Hindi writer said in an interview on Doordarshan, that ‘we’ (the people talking about the Indian languages) have accepted English as an Indian language and as our own: the question is whether ‘you’ (the English Elite) are prepared to accept the Indian languages as Indian and as your own.

She said this when the first great lit-fest was held a few years ago at a former royal palace near Jaipur where the guest of honour was V. S. Naipaul, who came with all his knightly glory. And where hardly any Indian language author was invited.

If you don’t listen to people like her, then some day you might have to listen to people like the Thackerays. And you might have to pretend that you like what they are saying.

Another plain fact is that most of the mainstream literary writers in Indian languages (whatever might be their other shortcomings) are neither chauvinists nor communalists. In fact, they are the most committed opponents of the right wing politics of the BJP and the Sangh Parivar. And hardly any of them has ever been able to survive from literary writing alone, except perhaps those whose books become textbooks, which is itself a long story. Dismissing the whole idea of linguistic fairness by waving the communalism card is something that we usually expect from unscrupulous politicians, but the Elite (especially of the Left variety) has been doing exactly this ever since the transfer of power to them. Absurd as it may sound, one can understand this if one realizes that they have always felt threatened that some day the vernacular hordes will take the power away from them. There is a great deal they have at stake. I suspect part of their initial vehement opposition against the BJP was motivated by this. And the BJP saw this and made good use of this: they started talking about political untouchability being practiced against them and they gained a lot of sympathy votes on this point alone. The same Elite later became much more tolerant of the BJP once it came to power. Perhaps they accepted it as the fait accompli.

Fait accompli is another card that is heavily used by the Elite. English is the most powerful language that can give you any chance of a decent career and the possibility of some kind of justice so just shut up and try to improve your English. As one strategic think-tanker recently wrote about the Taliban, if you really want to get something done, then you have to go and talk to the people who have power.

As a not so irrelevant aside, consider the paid news affair, which is causing quite a stir these days. Newspapers have always been used as weapons by both small and big power mongers. While the big newspapers are used more subtly, the smaller ones (with exceptions and to varying degrees) have either been directly owned by the powerful political and corporate people or have been available for hire. But after the Great Indian Reforms and Liberalisation, some big newspapers like the Times of India started the business of paid news quite openly. Till recently, however, there was only a little murmur of protest from the rest of the English Media. Then the ‘vernacular’ newspapers (for whom it is much harder to compete as they get fewer advertisements and at lower rates) started following the example of the TOI, but they did it more crudely. Suddenly it became a big issue, with even Dilip Padgaonkar telling us what a scourge paid news is.

Why would the editor of a National daily spend the time and effort to write an editorial about every non-committal language related statement from every two penny politician?

The Left part of the Elite is prepared to talk about all kinds of injustices except those related to language. Except when it is Indian language vs. Indian language. In that case it’s great fun for them.

What we actually have is a strange kind of fanatic language chauvinism practiced by the Elite against all Indian languages: more than just fetishist chauvinism. It’s so real that you only need to walk the roads of any Indian city and read the posters (among other things) of English teaching joints.

Not that there are no injustices in the name of Indian languages. The situation very much fits the big-fish-small-fish metaphor. There is also the infinitely indecent situation in Indian villages of there being separate upper caste and Dalit languages. The Dalits are not allowed to use the ‘upper caste language’. Language is used as a tool for domination, oppression and daily humiliation. In this language-eat-language world, the biggest fish by far in India (as in most parts of the world) is English. Even if it is spoken by a minuscule minority.

Trying to cover up this situation with slick diatribes about chauvinism and communalism might go on paying for a long time, but it might also lead to more dangerous situations than what we already have.

I really haven’t believed for one moment that the Thackerays have any love for Marathi. It’s their only possible ticket to power as of now. If they find some other better ticket, they will gladly drop the whole Marathi Manoos issue. The BJP and the Sangh Parivar are a bit more serious about the Hindi part of their slogan, but as their conduct while in power has shown, they care about Hindi only as much as the Bajrang Dal cares about the Indian culture. And everyone knows how much and of what kind that is. I abhor all kinds of chauvinism, but I still think it is an insult to the real chauvinists (like the ones who took part in the anti-Hindi riots a few decades ago) to call the Thackerays (or even the Sangh Parivaris) language chauvinists.

(1) What people like the Thackerays say, goes something like this:

  • Give licenses to taxi drivers only if they are Marathi speakers.
  • If the above is not done, we will get us some North Indian migrants kicked.
  • We will not allow anyone to do whatever we might decide they shouldn’t do.
  • We will thrash anyone who doesn’t agree with us.

(2) Here is what a real chauvinist might say:

  • Marathi is the greatest (or one of the greatest) language(s) in the world.
  • No Marathi speaker should use any word borrowed from any other language.
  • Hindi is actually a corrupted version of Marathi.
  • There is some evidence that the languages of Central Asia are derived from Marathi.

(3) A Marathi fetishist (if there are such people) might say this:

  • I am afraid to read English (or Hindi) books because they bring bad luck to me.
  • I must have a temple in my house to worship Marathi.
  • If my son doesn’t speak Marathi, I think he will become a pervert.
  • The captions of the Playboy centerfolds should be pasted over with Marathi ones before one looks at them.

(4) Then there could also be demands like:

  • English should not be compulsory at the primary level. It should be left to the parents to decide.
  • Students should not be punished for speaking in Marathi.
  • Knowledge of English (or Hindi) should not be compulsory for certain jobs.
  • Marathi writers (and newspapers, magazines, books) should be treated in the same way as English (or Hindi) ones.

There can’t be any debate about (1), (2) and (3), but as far as I can see, the three still have to be treated differently (say, for moral, psychological or political discussion). But there can (and should) definitely be debate about (4). That is, if by democracy you mean something substantial, not just a protective shield to keep your hold on the power indefinitely. If you put all four in the same group and dismiss them all, then there is some chance that this might lead to some bad things, even if Indians are ashamed to use their own language for higher purposes.

To touch upon another taboo topic, a great great deal has been written about Bombay becoming Mumbai, but I don’t remember anyone pointing out that Bombay had already been Mumbai for the Marathi speakers (not to say that it was and is Bambai for Hindi speakers), just as Calcutta had been Kolkata for Bengali speakers and Delhi has been either Dilli or Dehli for Hindi speakers. Is that completely irrelevant?

If we were to take the English Elite’s rhetoric about chauvinism seriously, one would have to call even Orhan Pamuk a language chauvinist. And Satyajit Ray. And Tolstoy. And every French writer. And so on.

In many places in his books Tolstoy resentfully showed how French was treated as the superior language among the Russian Elite and how no one among them wanted to be seen speaking Russian. Except maybe when talking to the inferior people: servants, peasants etc.

As one member of the Elite (in a moment of frankness) living in New Delhi narrated in a ‘middle’ in The Hindustan Times several years ago, she was embarrassed when a foreigner from the West came to visit them and tried to talk to them in Hindi. Because for her and for the people in her class, Hindi was a language to be used when talking to vegetable sellers.

Most members of the BJP would love to make a transition to the same class. Some have already done that.

There are schools in India where students are punished for using an Indian language. Not in the class room. Not just for any formal or academic purpose, but even in their private conversation, say while playing in the playground.

So much for chauvinism.

Not to mention the Fetish part.

As for the Thackerays, I wonder why they don’t write their surname as Thakre.

They are defiling the name of one of my favourite writers.

February 23, 2009

Hundred (Fictitious) Dollar Oscars

This has been said by someone else, but I will repeat it anyway: if the new ‘Indian’ craze in the West, Slumdog Millionaire, wins one (or possibly more) Oscars, it will be due, to a large extent, to one particular scene in the movie. After the protagonist plays foul with an American (actually the US, lest we forget altogether) tourist couple and is being beaten brutally by an Indian, the American couple rescues him, only to get the retort that ‘you wanted to see a bit of real India’. And the lady’s answer is to get a hundred dollar note (we don’t call it a ‘bill’, nor a ‘bank note’) from her husband and hand it to the offending boy with what we call a dialogue in Hindi, ‘well, here is a bit of real America, son’. As the person who mentioned this scene earlier (although I had thought of the scene in more or less the same way) also pointed out, this American lady (that’s what we now call a woman in Hindi) is shown to be the only really good person in the whole movie.

But the movie is supposed to be all about Indians, so there are no real people other than Indians except this lady. The only Western (White and presumably Christian) person in the whole movie can hardly not be a representative of the average Westerner (let alone the US Americans) as opposed to the wretched, written-to-be-wretched, Indians, especially when she makes such a grand gesture accompanied by a solid dialogue.

Since there still are people out there who are going to criticize (or already have criticized) this movie for some crappy reason like selling India’s poverty to the West etc., one has to give out the mandatory disclaimer that one is most certainly not against this movie for any such reason. In fact, one is not really against this movie at all.

I most probably wouldn’t have commented on this movie had it not become such a sensation and also given that a lot of insightful commentators have already written about it. But now it looks very much likely that the movie is going to get that most-prestigious-in-the-globe-but-actually-the-US-American movie award named Oscar, and probably more than one. This means that the movie will be taken seriously by a lot of non-Indians and perhaps even by some Indians. And, as I indicated earlier, it is not really such a bad movie. The problem is that it is not a great movie at all, which is what it is being made out to be outside India.

And like one other commentator (pardon me for not giving references, but I am tired right now: though I can provide them on need), I find it hard to believe that it is directed by the same person who directed that movie which is in my list of Very Good Movies (in the company of movies by Bergman, Fellini, De Sica, Kubrick and the like), namely Trainspotting. Whereas that movie was exactly what it wanted to be, this movie almost fails completely, although it is still entertaining.

There are so many things which are fundamentally and very clearly wrong with this movie. Accent is, of course, one of them. I wonder whether Danny Boyle knows that the knowledge of English (and even more so its use with a particular accent) is the single most reliable indicator of one’s socio-economic status in the Indian subcontinent. And the movie shows the ‘slumdog’ using the highest caste accent while the elite TV show host uses a pretty low caste accent (yes, Anil Kapur’s accent is not very ‘good’ and he would usually be looked down upon among a circle of people speaking in an almost British accent, as the protagonist does).

I would urge Danny and his crew to go and see Tashan, which has some similarities with this movie and also stars Anil Kapur.

The movie could have been so much better if it was made in Hindi and had better casting and had hired some accent tutors like they do in Hollywood even for the all-(US)-American movies.

The second big problem is that the novel on which it is based doesn’t talk Karma-Varma at all. And the movie resolves everything at the end by saying ‘because it is written’. And Danny Boyle himself in an interview (roughly) said that you simply can’t resolve the complexities of India: they are just there. Then he said ‘they even have a philosophy for this’, which says to me that he seems to know very little about India. Yes, there is a philosophy of that kind, but there are innumerable other philosophies too.

Come on, Danny, no one in India actually says ‘I don’t know, I have got a sort of Karmic feeling about this’ or something like that (as the TV show host does). This Karmic terminology is used more in the West than in the ‘real India’. No one really talks about ‘Karma’ here. (Even when they do, they don’t do it in this way). Though they do talk of Bhaagya and Taqdeer and Maathe Ki Lakeer etc. Which is not the same thing. And which is the reason this movie can be accused of indulging in post-modern Orientalism (someone else said that too).

In many parts of India, if you spoke out the word ‘Karma’ in the way Danny Boyle (or any Westerner talking about India) does, people would think you were talking about a patriotic movie starring an old Dilip Kumar pairing with one of my favorite (favourite for the less dominant party to which Danny belongs) female actors, Nutan. This ‘Karma’ is, of course, not the same word. In fact, it’s not a word at all: it’s a name.

It’s an ambiguous Named Entity that I would classify as either a Person or as an Object-Title, depending on the context.

In the same interview, Danny Boyle says about Mumbai (which we still quite often call Bambai – बंबई in Hindi and Bombay in English) that ‘they call it the Maximum City’. Well, it’s actually Suketu Mehta who calls Mumbai that. A lot can be said about that book too, but I won’t say it now.

Now the music. Well, the simple and solid fact is that A. R. Rehman has given much better music before, right from his very first hit, Roja. If some Indians start respecting him now because he wins an Oscar or two, I can only pity them. And I pity the non-Indians too: for being completely unaware of such great music even in this .mp3 era. Music which has been heard and liked by hundreds of millions of people for more than one and a half decades now.

But let me reiterate. This is not such a bad movie. Your money won’t be wasted if you go and see it. But it is definitely not ‘a gritty and realistic’ movie about India, except in some ways which are of no use to an Indian and could be misleading for a non-Indian.

Let me reiterate something else. The Indian ‘reality’ is much worse than what is depicted in the movie, which is basically a lived-happily-ever-after fantasy.

And featuring the US American lady in the movie with her fictitious hundred dollars is a cheap (pun intended) trick to win over the Western (especially American) audiences whose senses will be offended by what is shown in the movie (for the dummies: this is a deliberate but slight exaggeration). Because if the truth were told, a big share (not all, of course) of the responsibility of this worse reality of India (as of other colonized or near-colonized countries) rests with the West.

Overall, Slumdog Millionaire is in the same league as Baz Luhrmann’s Moulin Rouge. Both movies are inspired by the ‘Bollywood’ style of film making and both have directors who seem to know precious little about India but who wanted to pay some tribute to the country and its films, just as the earlier Orientalist artists paid their own tributes to the seductive, exotic East as imagined by them with their artistic temperament. But as an Indian I feel that the latter movie has a definite edge. That could be partly because it doesn’t pretend to know (and, therefore, tell) much about India.

Slumdog Millionaire’s only connection to Trainspotting, ironically, happens to be a scene that was hard to watch even for the hardened Indians: the jump in and out of the shitpot. And even this scene was done much better in Trainspotting.

There is also a serious matter that is concerned with both the style as well as the content. It’s a very tricky matter to mix realism with fantasy, which is what Slumdog Millionaire tries to do. And it does quite a bad job of it.

As it happens, Danny Boyle came and lived in India for some time for making this movie. One gets the impression that he was overwhelmed by what he saw and didn’t quite know what to make of it. And in such cases the easiest resort is to the Karmic poppycock that the movie ends at. Small mercy that it is done with the tongue at least lightly in the cheek.

P.S.: Also for the dummies, the word ‘caste’ above has been used metaphorically, not literally. Knowledge of English and the accent is a big (perhaps the biggest) determinant of the metaphorical caste in India. Even in the India of Call Centres. Or should it be ‘especially in that India’?

January 12, 2009

Picture of the Future

Orwell described a picture of the future rather bleakly as:

There will be no curiosity, no enjoyment of the process of life. All competing pleasures will be destroyed. But always—do not forget this, Winston—always there will be the intoxication of power, constantly increasing and constantly growing subtler. Always, at every moment, there will be the thrill of victory, the sensation of trampling on an enemy who is helpless. If you want a picture of the future, imagine a boot stamping on a human face … forever. (1984 by George Orwell: Part III, Chapter III)

This, I believed, was a dystopian picture. I still do. I have my own picture of the future, which has remained almost unchanged for the last decade (at least). Three recent events somehow seem to me to be describing my picture of the future.

The picture is mine, but the future need not necessarily be mine.

But it can very well be.

The first is the unbelievably and blatantly criminal assault by Israel on all Palestinians: man, woman and child. I won’t give references for this. It’s there prominently even in the mainstream media and has been there for some time now.

The second is a recent call by the Andhra Pradesh Human Rights Commission chief (Chairman) for “legislation to prosecute parents with diseases such as tuberculosis, HIV, leprosy and dyslexia should they, knowing that they have the disease, have children”.

Inhuman Rights Commission?

The third is the news, or rather the lack of it, about the recent death of a Hindi writer living in Jaipur (yes, the connection with ‘your’ places does make it worse) Lavleen (लवलीन) who was relatively young. She had a reputation as a ‘bold’ writer and woman. She hadn’t really established herself as a great writer, but she was known among the Hindi literary circles. Let alone the Indian English media, (it has been pointed out) even the ‘biggest Hindi daily’ Dainik Bhaskar didn’t report it, even after many requests. And even the small but very vibrant and inter-connected world of Hindi blogging (which is very enthusiastic about events like the wedding of someone’s relative among them) mostly ignored it, though they are trying very hard to find out who ‘the real Tau’ (असली ताऊ) is. Like a lot of other writers, she died with the dream of some day writing a masterpiece.

(But still, I came to know about this from a Hindi writer’s blog).

And, no, I didn’t personally know her. Nor do I know the A. P. Human Rights Commission Chairman. Nor have I ever been to Israel, though a large percentage of the people (in History) I admire happen to be Jewish and most of them (I am sure) would be, or have been, horrified by what Israel is doing.

I don’t know why but these three events (or should I say sets of events: being a ‘professional’ practitioner of language sciences, crafts and arts is tough when it comes to writing anything) somehow represent for me the picture of the future.

This picture is not quite as horrible as that painted by Orwell (actually, by O’Brien the character, whether or not by the author).

But it doesn’t seem very pleasant.

October 28, 2008

Computational Linguistics

As I wrote in the previous entry (can the word ‘प्रविष्टी’ be used for ‘post’?), I am going to write about Sanchay over the next few weeks.

But since Sanchay has been built especially (apart from general users) for researchers in computational linguistics or linguistics, it is best to make clear what computational linguistics and linguistics mean, or, if you already know what they mean, what I mean by them. I add the latter because not only do ordinary people hold all kinds of misconceptions about the meaning of these subjects (computational linguistics and linguistics), but even the researchers in these subjects do not agree on their definitions.

The truth is that in the Hindi world most people still take linguistics to mean the kind of study it meant at the beginning of the last century. But I would not like to go in that direction right now, because there is so much to say about it that the present purpose would be left behind.

There is also a great deal to say about the definition of computational linguistics or linguistics and about their boundaries, but for now a little will do.

So, to put it briefly, linguistics is the field of research or study in which one does not study just the grammar of some one language, but studies natural or human (i.e., not artificial) language scientifically. It is now a widely accepted notion that the structure of the human brain is directly related to the structure of language, and because the structure of the brain is basically the same in all humans, all natural or human languages are also the same in everything except superficial features. That is why, as is famous in the modern literature of these subjects, if an American infant were adopted right after birth by a Chinese family and grew up in China, the child would learn to speak Chinese as easily as a child born to any Chinese family. There are many more such things, but the main point is that linguistics is the scientific study of natural or human language.

At least the attempt is for the study to remain scientific. Whether it actually manages to is a matter of debate.

Coming now to computational linguistics, in this subject our attention shifts from humans to the sanganak, i.e., the computer, but the previous condition still applies: the scientific study of natural or human language. The difference is that our aim now becomes to make the computer capable of understanding natural or human language and of using it. Obviously this is still a very distant thing, and that should be no surprise, because even in linguistics itself (despite the extraordinary achievements of the last century) scientists are stuck among a lot of obstacles.

Still, a good deal has already become possible in computational linguistics and a good deal more may become possible ahead (in the near future). But this does not include the computer speaking and understanding language like a human. What it does include are technologies that can search documents better, summarize them, translate them to some extent, and so on.

But the trouble in the Indian context is that we have not yet even reached the state where we can easily use the computer as just a better typewriter. There have been some achievements in this direction, but compared to English or the major European languages we are nowhere. As most of you already know, this is a long story that is best left alone for now.

But Sanchay has been developed in this very context, which we will talk about later.

October 26, 2008

An Introduction to Sanchay

In the previous post (I am ashamed to say that I cannot find a suitable Hindi word for ‘post’) I had written (in English) about the new version of Sanchay. The funny thing is that I have so far written hardly anything about Sanchay in Hindi. In an attempt to correct this lapse, I have now thought of writing something about Sanchay over the next few weeks.

So who is Sanchay? Or what is Sanchay?

The answer to the first question (in American terminology) is that Sanchay is a single parent child who is not getting the benefit of any welfare but who has a lot of responsibilities.

The answer to the second question is that Sanchay is a free (as in freedom, though you can also say free of cost) and open source collection of computational tools useful for researchers working in the area of computational linguistics or linguistics. But in particular it can be of use to anyone who uses Indian languages on a computer. One of its features is that new languages and encodings can be added to it easily. Almost all the major Indian languages are already included, and you do not depend on the operating system to use them in Sanchay, although if the operating system does include any such language, you can use that facility in Sanchay too. Not only that, one and the same version of Sanchay works on both Windows and Linux/Unix, provided you have the JDK (Java Development Kit) installed. Even the font for your language need not be installed in the operating system.

The current version of Sanchay is 0.3.0. The biggest difference between this version and the previous one is that all of Sanchay’s tools can now be used from one place, so there is no need to remember the names of separate scripts. In all, twelve tools (applications) have been included, which are:

  1. Sanchay Text Editor
  2. Table Editor
  3. Find-Replace-Extract Tool
  4. Word List Builder
  5. Word List Analyzer and Visualizer
  6. Language and Encoding Identification tool
  7. Syntactic Annotation Interface
  8. Parallel Corpus Annotation Interface
  9. N-gram Language Modeling Tool
  10. Discourse Annotation Interface
  11. File Splitter
  12. Automatic Annotation Tool

If you cannot make head or tail of most of these, wait a little. I will try to give more information about them later.

Perhaps there is no harm in adding that Sanchay is the result of this humble self’s stubborn resolve over the last few years, in which some other people have also helped, even if only a little each. The names of all those people will soon be visible on the Sanchay website. Almost all of them are (or were) students who worked, or are working, on some project under my ‘guidance’.

Hopefully the version of Sanchay after this one will arrive in a few months and will have even more tools and facilities.

October 5, 2008

Good News and Bad News on the CL Front

First, as the saying goes, the bad news. We had submitted a proposal for the Second Workshop on NLP for Less Privileged Languages for the ACL-affiliated conferences. That proposal has not been accepted. In all, 41 proposals were submitted and 34 of them were accepted. Ours was among the not-accepted seven (euphemisms can be consoling).

Was it that bad? I hope not.

Don’t those capital letters look silly in the name of a rejected proposal?

Now the good news. The long awaited new version of Sanchay has been released on Sourceforge. (Well, at least I was awaiting). This version has been named (or numbered?) 0.3.0.

The new Sanchay is a significant improvement over the last public version (0.2). It now has one main GUI from which all the applications can be controlled. There are twelve (GUI based) applications which have been included in this version. These are:

  • Sanchay Text Editor that is connected to some other NLP/CL components of Sanchay.
  • Table Editor with all the usual facilities.
  • A more intelligent Find-Replace-Extract Tool (can search over annotated data and allows you to see the matching files in the annotation interface).
  • Word List Builder.
  • Word List FST (Finite State Transducer) Visualizer that can be useful for anyone working with morphological analysis etc.
  • One of the most accurate Language and Encoding Identifier that is currently trained for 54 language-encoding pairs, including most of the major Indian languages. (Yes, I know there is a number agreement problem in the previous sentence).
  • A user friendly Syntactic Annotation Interface that is perhaps the most heavily used part of Sanchay till now. Hopefully there will be an even more user friendly version soon.
  • A Parallel Corpus Annotation Interface, which is another heavily used component. (Don’t take that ‘heavily’ too seriously).
  • An N-gram Language Modeling Tool that allows you to compile models in terms of bytes, letters and words.
  • A Discourse Annotation Interface that is yet to be actually used.
  • A more intelligent File Splitter.
  • An Automatic Annotation tool for POS (Part Of Speech) tagging, chunking and Named Entity Recognition. The first two should work reasonably well, but the last one may not be that useful for practical purposes. This is a CRF (Conditional Random Fields) based tool and it has been trained for Hindi for these three purposes. If you have annotated data, you can use it to train your own taggers and chunkers.
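For readers wondering what compiling n-gram models "in terms of bytes, letters and words" can mean, here is a minimal Python sketch. It is not Sanchay's code (Sanchay runs on Java); the function name `ngram_counts` and the sample text are made up for this illustration, and "letters" are simply taken to be Unicode code points, which is only one possible choice for Indian scripts.

```python
from collections import Counter


def ngram_counts(units, n):
    """Count all contiguous n-grams in a sequence of units."""
    return Counter(tuple(units[i:i + n]) for i in range(len(units) - n + 1))


text = "भाषा और भाषा"  # hypothetical sample text

# The same text seen as three different sequences of units.
byte_units = list(text.encode("utf-8"))  # bytes (integers 0-255)
letter_units = list(text)                # Unicode code points
word_units = text.split()                # whitespace-separated words

byte_bigrams = ngram_counts(byte_units, 2)
letter_bigrams = ngram_counts(letter_units, 2)
word_unigrams = ngram_counts(word_units, 1)
word_bigrams = ngram_counts(word_units, 2)
```

The counts are only the raw material of a language model; turning them into probabilities (with smoothing) is a separate step that this sketch leaves out.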

All these components use the customizable language-encoding support, especially useful for South Asian languages, that doesn’t need any support from the operating system or even the installation of any fonts, although these can still be used inside Sanchay if they are there.
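To give a feel for how a language and encoding identifier of the kind listed above might work, here is a small Python sketch of one common approach: build a byte n-gram profile for each language-encoding pair and pick the pair whose profile is most similar to the input. This is an assumption about the general technique, not a description of what Sanchay actually does; the function names, the cosine similarity measure and the three tiny training samples are all hypothetical.

```python
import math
from collections import Counter


def byte_profile(data, n=3):
    """Relative frequencies of byte n-grams in a bytes object."""
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def identify(data, profiles):
    """Return the (language, encoding) label with the most similar profile."""
    target = byte_profile(data)
    return max(profiles, key=lambda label: cosine(target, profiles[label]))


# Toy training data (hypothetical); a real system needs far more text.
hindi = "यह हिंदी का एक छोटा सा वाक्य है"
training = {
    ("Hindi", "UTF-8"): hindi.encode("utf-8"),
    ("Hindi", "UTF-16LE"): hindi.encode("utf-16-le"),
    ("English", "UTF-8"): "this is a short sentence in english".encode("utf-8"),
}
profiles = {label: byte_profile(data) for label, data in training.items()}

guess = identify("यह एक वाक्य है".encode("utf-8"), profiles)
```

Working on raw bytes is what lets such a tool guess the encoding and the language together: the input never has to be decoded before it is classified.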

More information is available at the Sanchay Home.

The capitals don’t look so bad for a released version.

The downside of even this good news is that my other urgent (to me) work has got delayed as I was working almost exclusively on bringing out this version for the last two weeks or so.

But then you need a reason to wake up and Sanchay is one of my reasons. And I can proudly say that a half-hearted attempt to generate funding for this project by posting it on Micropledge has generated 0$.

Sanchay is still alive as a single parent child without any welfare but with a lot of responsibilities.

Now I can have nightmares about the bugs.

March 31, 2008

The Hemingway (or Pilar) Argument for Diversity

Innumerable arguments can be given in favor (favour for the non-dominant party) of diversity. That is, diversity of all kinds: cultural, ecological, linguistic etc. But in this post I present a particularly good one. It’s from Hemingway’s ‘For Whom the Bell Tolls’, which I am reading right now:

‘Then calm yourself. There is much time. What a day it is and how I am contented not to be in pine trees. You cannot imagine how one can tire of pine trees. Aren’t you tired of pines, guapa?’

‘I like them,’ the girl said.

‘What can you like about them?’

‘I like the odour and the feel of the needles under foot. I like the wind in the high trees and the creaking they make against each other.’

‘You like anything,’ Pilar said. ‘You are a gift to any man if you could cook a little better. But pine trees make a forest of boredom. Thou hadst never known a forest of beech, nor of oak, nor of chestnut. Those are forests. In such forests each tree differs and there is character and beauty. A forest of pine trees is boredom. What do you say, Inglés?’

‘I like them too.’

‘Pero, venga,’ Pilar said. ‘Two of you. So do I like pines, but we have been too long in these pines. Also, I am tired of the mountains. In mountains there are only two directions. Down and up and down leads only to the road and the towns of the Fascists.’

The forest analogy is good enough in itself, but I really liked the natural connection at the end between the lack of diversity and Fascism.

I don’t need to remind that diversity is fast eroding from every sphere of life. Even in India, the land of more diversity than perhaps any other. I also don’t need to remind that Fascism is rising in almost all regions of India, in various forms. Neither do I need to remind what is being used as a cover for rising Fascism. Yes, the T-word, which is sometimes equated to the M-word and sometimes to the N-word. With a lot of talk about the W-word.

There is no exaggeration here in the use of the F-word, although I do use the device of exaggeration sometimes.

And no, there are no mistakes in the language used in the quote due to my typing. This is just a mild example of how Hemingway represented Spanish speech in English.

March 7, 2008

Transcribing Romance on Your Menu

It makes us feel that we are all extras in somebody else’s movie.

That’s a comment someone made about the movie I am going to write about today. I am not the kind of person who likes to watch the same movie again and again. But there are exceptions. So I do watch some movies more than once. And this one is a movie I have watched the second highest number of times.

From what I have written so far about movies, the regular readers of this blog (assuming there are any), might have got the impression that I am a very dry kind of person. Always talking about serious movies. And always talking about only the serious (political, philosophical, psychological) themes in all movies.

I am not going to do that in this post. Not because I want to prove something (there goes an apology). Just that this particular movie doesn’t have anything serious to say about life. And, therefore, I don’t have anything serious to say about the movie either. (Well, yes, this is more of an exaggeration than a literal truth).

But I still have watched this movie the second highest number of times (for me of course). And will definitely watch it again. More than once.

Like the other movie that I have watched the highest number of times (for me of course), this movie too was a big surprise.

In how many non-Indian movies will you find a Punjabi folk song on the soundtrack? A song like the one transcribed below.

This is one other very unusual unme-like thing I am going to do in this post. Transcribing the lyrical and poetic parts of the soundtrack of a non-serious movie. There might be some mistakes in the transcription (there goes a disclaimer), but then I won’t be the only one to do that (there goes an excuse). Just a few days ago I bought a sackful of second hand books (all in English: good Hindi books don’t have a market, even a second hand market) from a roadside Sunday book bazaar. One of the things I bought was a booklet titled ‘Joyful Hearts (For Private use only)’. It had lyrics of popular songs in several languages, all transcribed in the Latin script. One of them (California Dreamin’) is on the soundtrack of the movie I am writing about. I too have transcribed it below, but I have done so from the movie. The version in the booklet wrongly contains the word ‘in a lay’ instead of ‘in L.A.’. Actually, the task for me was easier (for English songs) because the subtitles also had the lyrics. But the Hindi and Punjabi words I had to transcribe on my own. And if I remember correctly, even the subtitles had some mistake in the transcription of an English song.

Anyway, here is the Punjabi folk song:

पिपलां दी ठंडी-ठंडी
छाँ वरगी
सत्थ मैनूं लग्गे
मैनूं वरगी

मैं वी उन पुच्छ के
बैर कर दी

So, how many foreign (non-Indian) movies will have this kind of real and really beautiful folk song that is hard to find even in India? (I am talking about music more than the words. Unfortunately, I can’t transcribe the music).

Even in an India where, while Punjabi as a distinct language is going down the extinction path as much as any other language except the lucky handful, certain aspects of Punjabi culture are making inroads even in the South. And music is one of those aspects. But, tragically (I mean it: I don’t use words lightly), the Punjabi music that is proliferating is of the worst kind.

And how many foreign movies will have light classical Hindustani music with words like this:

बदरवा बरसन लाई
लाई फूहारों की लड़ाई
पवन चलत पुरवाई
बदरवा …

As this is Hindustani classical music, even if light one, the words give very little indication of the beauty of the music. Unless you have a gift for discovering the music hidden within the words. A well known Hindi film music director used to say that all songs (i.e., lyrics) have music hidden within them. You just have to find that music and you can get the right composition for the song. I think he was at least partially right (there go weasel words).

But the one that follows takes the cake. In how many movies will you find hardcore poetry in hardcore standard Hindi? The shuddh Hindi. The pure Hindi. Even I don’t understand everything in this poem. And, I am ashamed to say, I don’t even know whose poem it is.

गर्जन भैरव संसार
हँसता है बहता कल कल
देख देख नाचता हृदय
बहने को महाविकल बेकल
इस मरूर से
इसी शूर से
सघन भूर गुरू गहन रूर से
मुझे गगन का दिखा
सघन वह छोर
राग अमर अंबर में भरने जरूर

ए वर्ष के हर्ष
बरस तू बरस पर तरस खा कर
मार दे चल तू मुझ को
बहार दिखा मुझ को

गर्जन भैरव संसार
हँसता है नर खल खल
बहता कहता
बुद बुद कल कल
देख देख नाचता हृदय

This poem, like other songs in the movie, is played in more than one bit and is employed as the musical theme of a certain bit of the ‘story’.

There is not much of a story though. What you see in this movie, what made me watch it the second highest number of times, and what made this one of Tarantino’s favorite movies, is simply cinematic magic.

Magic created out of photography, choreography, composition, colors, music, musical words and romance. Simple almost unreal and surreal romance made magical.

By the way, the movie is called ‘Chungking Express’ and is directed by Wong Kar Wai. And it stars a very good looking star cast consisting of Brigitte Lin, Tony Leung Chiu Wai (the smaller, who is a bigger super star than the bigger Tony Leung of ‘The Lover’), Faye Wong (who was already a pop star), Takeshi Kaneshiro (who actually knows four languages and uses them all in this movie) and Valerie Chow.

The movie also has a song from one of Faye Wong’s albums which I couldn’t transcribe as I know neither the language nor the script.

I have a feeling that this movie has influenced a lot of people working in the realm of popular culture.

It is also influenced by a lot of other creations by other people working in the realm of popular culture.

It’s not every day
We are gonna be
The same way
There must be a change
Somehow

There are bad times
And good times too
So have a little faith in
What you do, oh yeah
Getting happy, yeah
I want you to understand, yeah

The movie actually has two interwoven stories (CLICHE!). Roger Ebert may be right in saying that watching this movie is a cerebral exercise as you like this movie because of what you know about it, not what it knows about life.

But Roger Ebert can be horribly wrong sometimes. Like when he wrote a review of Malena. I will just quote Michael DeZubiria to point out how unbelievably wrong the best known movie reviewer in the world can be (there goes a marathon digression):

Roger Ebert wrote probably the most idiotic review I’ve ever seen him come out with about this movie. He missed the point of this movie even more than he missed the point of Memento, and his review of that movie was like a blind man describing a shooting star. He describes Malena as a schoolteacher “of at least average intelligence, who must be aware of her effect on the collective local male libido, but seems blissfully oblivious.”

Roger, seriously, are you joking? BLISSFULLY?? Did you sleep through this movie?

She almost never speaks at all and never displays even the slightest hint of a smile. Given the extent of her depression and stifling sadness, it is astounding to me that anyone in their right mind could attach the word “blissfully” to any element of her character.

I know what that’s like though, because sometimes I completely miss something about a movie and I think that something else is the stupidest thing in the world because of it, at least until someone explains what I missed and then it all makes sense. Watch Malena, for example, walking through the central square in town at any point in the movie. If you think she keeps her eyes on the ground directly in front of her because she is in a state of pure, ignorant bliss, then trust me. You are missing something.

I don’t know if Malena was actually unaware of the effect that she had on the townspeople, but I find it nearly impossible to believe that she did. That thought actually never even occurred to me until I read Roger Ebert’s gem of a review. Her behavior struck me much more like someone who had been dealing with such behavior from the men around for her whole life. I doubt very much that she doesn’t understand the concepts of human physical attraction.

Coming back to the current movie, I can say with a crystal clear conscience (I don’t like to lie too much) that this is one of the best movies about plain and simple ‘love’ type romance.

What a difference
A day makes
Twenty four little hours
Brought the sun and the flowers
Mmm, where there used
To be rain

My yesterday was blue, dear
Today I am a part of you, dear
My lonely nights are through, dear
Since you said you were mine

Lord, what a difference
A day makes
There’s a rainbow before me
Skies above can’t be stormy
Since that moment of bliss
That thrilling kiss

It’s heavens when you
Find romance
On your menu

What a difference
A day made
And the difference
Is you

But then it is a movie by the master of nostalgia. Wong Kar Wai can make you feel extremely (I don’t use adjectives or adverbs lightly) nostalgic even about places where you have never been. He can even make you feel nostalgic twice removed. In this movie he first makes you nostalgic about Hong Kong (even if you have never been there) and then he makes you feel nostalgic about California (even if you have never been there) from Hong Kong. And all this time you (there goes projection) are sitting in a man made cave in India.

All the leaves are brown
And the sky is gray
I’ve been for a walk
On a winter’s day
I would be safe and warm
If I was in L.A.

California dreamin’
On such a winter’s day

Stopped into a church
I passed along the way
Well, I got down on my knees
And I pretend to pray
You know the preacher likes cold
He knows I am gonna stay

California dreamin’
On such a winter’s day

If I didn’t tell her
I could leave today

California dreamin’
On such a winter’s day

I was (along with the person who gave that movie to me) fascinated by the soundtrack of another one of Wong Kar Wai’s movies, ‘In the Mood for Love’. But ‘Chungking Express’ beats even that movie. It has one of the best soundtracks in the history of movies. In fact, I have watched it sometimes just for the soundtrack. And I am not really crazy about movie soundtracks.

Tarantino has claimed that everyone that he knows who watched this movie (he only knows men, or, more likely, he only counts men) had a crush on Faye (who is named Faye in the movie too).

A tribute from the king of cinematic non-serious violence to the king of cinematic non-serious romance.

So, whenever you want romance on your menu, go to Wong’s. They serve the best there. You will find yourself visiting frequently.

Even if there were nothing else, I would still watch this movie to listen to the Hindi poem being played on the TV, accompanied on the soundtrack by many other sounds.

Hindi poem on cinema. Foreign cinema. Now there’s a rare thing for you if there ever was one. Even if it forms the backdrop of an almost comic botched small time drug smuggling operation involving many very bad looking lower class ‘Indians’ who are actually Pakistanis.

February 29, 2008

English is Language Independent

It’s the Global Language, right? So how can it be language dependent? You propose a theory based on English. It has to apply to all languages. You propose a Natural Language Processing (NLP) or Computational Linguistics (CL) technique for a particular problem. For English. It applies to all languages. You build a software for some purpose. For English. It has to be useful for all languages. You build a dictionary…

Never mind.

But the vice versa is not true. You propose a theory based on Hindi. It is language specific. It doesn’t count for much. You propose an NLP technique for a particular problem. For Hindi. It is language specific. It doesn’t count for much. You build a software for some purpose. For Hindi. It is language specific. It doesn’t count for much.

That’s how it works in practice, if not theory. Or maybe even in theory, with some help from the (very valid) idea of Universal Grammar (except that the UG may be the UG of English).

Even today I have got a review of a paper on a problem which is like one of the holy grails of NLP or CL. One of the comments is that the approach has been evaluated on Hindi so it can’t be compared to other techniques that already exist. True. But what is the number of papers published in the ‘first class’ NLP/CL conferences and journals in which the approach has been tried only on English? Doesn’t matter, because English is language independent. If you only evaluate your technique on English, that’s OK. But if you evaluate on only Hindi, that’s not acceptable. Because Hindi is language specific.

We know this very well in India. The Elite talks about (Indian) literature. And sometimes the Elite magnanimously (or dismissively) talks about (Indian) literature in languages. The first, of course, refers to literature in English. The second refers to literature in other languages. Indian languages.

The Elite talks of media. And the Elite (rarely and mostly negatively) talks of language media.

Hindi is a language. English is not a language.

Pardon me.

Hindi is a language. English is the language.

English is above being merely a language.

That’s why all the work done in English is language independent. Not just research. Not just in NLP/CL. Anything. Movies, literature, music.

I am guilty of the sin of indulging too much in mere languages. I should be working mostly on English. Not just writing blog posts in English. Sometimes, of course, I can bestow a bit of my attention on languages. Like Hindi.

But I won’t do that. I will do the opposite. I am incurable.

October 8, 2007

On (My) Linguistic Doublewrite

Filed under: Language Problem,Things As They Are — anileklavya @ 6:12 am

Some readers of this blog must have noticed that I talk about using Indian languages and providing support for them, but almost all my posts till now are in English. Well, I had started this blog with the intention that I will write mostly in Hindi. But except the About section and the first short post about Hindi ZNet, I have only written in English.

I plead guilty to the charge of limited linguistic doublewrite with respect to this blog. Whatever may be the situation, we can always have excuses, and sometimes it’s very hard to separate the excuses from the genuine reasons. In fact, a genuine reason may become an excuse (to yourself) if your level of commitment increases. I also have some excuses. Or reasons. First is, of course, that my years of higher education and professional reading and writing have ensured that even I find it easier to write, and even more importantly, type in English. So, the reason (or one reason?) is convenience (possibly to be decoded as frustration?). When viewed from a higher level of commitment, it becomes an excuse: to cover laziness or dilution of commitment.

Am I going to mend my ways? I will try. I won’t stop writing in English here, but hopefully there will be more posts in Hindi. Hopefully (again) in the near future (the MS Word grammar checker says it’s a fragment and is asking me to consider revising: I decline).

Does it sound very promising?

Hope so.
