अनिल एकलव्य ⇔ Anil Eklavya

September 5, 2011

The Missing Clause

There is a legal agreement written in very legal language that I had to read today. It’s called a Mutual Confidentiality Agreement and is required to be signed by two parties who plan to collaborate on some commercial product or service.

After having plodded through the legalese and having understood most of it (I have an advantage in this regard), I found that there was one clause that was glaringly missing from it.

The document lists all the conditions that apply when the Disclosing Party discloses something to the Recipient. It has a section euphemistically titled ‘Injunctive Relief’ that might send shivers down the Recipient’s spine, depending on the power balance. It also lists all the exceptions under which these conditions may not apply. Such exceptions include “court order” and “as required by law”.

What is missing is something that should be included in all such documents post-9/11, in all countries that went for the security Gold Rush, which practically means all countries, (almost) period.

That missing clause should go something like this:

An (unintended) disclosure by the Recipient to any number of third parties of any of the Disclosing Party’s Confidential Information will not be considered a breach of the agreement if it happens under any of the following conditions:

  1. As part of surveillance operations carried out by the State and any of its agencies, the institution in which the Recipient works or any part thereof, the Local Version of the Truman Show, the Connectivity Service Providers, the Private Security Companies, the Local Quasi-authorised Vigilante Organisations or any other such agencies added to the list till the eve of the day the breach is considered for scrutiny.
  2. [Talking of eve] As a result of eavesdropping by the agencies and organisations listed in 1.
  3. As a result of disclosure by the people involved in (a) surveillance and (b) eavesdropping by the agencies and organisations listed in 1 to any of their superiors, colleagues, subordinates, business associates, friends, relatives, family members or strangers.

The clause sounds very reasonable in the post-9/11 world and makes perfect legal sense. After all, any disclosure made (unintentionally) under conditions listed in this clause would not be the fault of the Recipient and it would only be for The Good of The Country and The World and The Humanity (as everyone knows and agrees to).

I have one doubt, however. Won’t the addition of this clause almost nullify everything else in this agreement to mutual confidentiality?

But the clause is required. Isn’t it?

And what about that poor thing, The Market?

Is it already being forgotten in favour of other things?


November 17, 2010

So Dissent is Just a Disease After All

If you are even a little bit well read, you might have come across the name of Bertolt Brecht, even if you don’t recall it now. He is well known as one of the most important figures of twentieth century theatre (theater for the more dominant party). But his influence goes far beyond theatre. It extends to movies, literature, poetry (he was also a poet), political thought and so on (not excluding the Monty Pythons). It even goes beyond the boundaries of the East-West or the North-South divides. I wasn’t surprised at all when I read yesterday that there are ’30 something’ MA theses in South Korea alone (written in Korean) on Brecht. In India, he has been widely written about and heavily quoted by intellectuals, especially those writing in Indian languages. One of the most respected Hindi poets, Nagarjun, even wrote a poem about Brecht. I would have loved to provide a translation of that poem here, but I don’t feel equal to the task as the poem uses words whose equivalents in English I am unable to think of. Some poems are translatable, some are not.

Brecht has been on my mind these days as I have translated some of his poems (from English) into Hindi in the last few days. This exercise included a bit of surfing the Net for his name too and as a result, I came across something that made me write this. Or, at least, acted as a catalyst or the precipitating agent for writing this.

I don’t mean to present a brief bio of the man here. You can easily find plenty of material about him on the Internet and in any good library. I am not even a minor expert (in the technical sense) on him or his works. But I might mention here that some of the things he is known specifically for include these:

  • His plays and his active theatre work (in particular the ‘epic theatre’ works like The Life of Galileo, The Threepenny Opera and Mother Courage and Her Children)
  • His theory about theatre, which is centred around the idea of the ‘alienation effect’
  • His poetry
  • His affiliation to Marxism (though of the dissident kind)

It should not be hard to guess now (if you were unfamiliar with him earlier) that it is the fourth point that would get most people interested, either approvingly or otherwise. You write plays, you do theatre, you pen poems, that’s all quite alright. No problem. Have your fun. Let us have some too. We can spend time discussing and arguing about it too. But being a Marxist is taking this business to a different territory. That’s politics. That might lead to talk of revolution. Or, at least, to that of radical change.

And so it does. Intellectuals, artists and activists around the world who are not satisfied with being a real or potential (‘wannabe’) Salman Rushdie or V. S. Naipaul and who want to do or say something more about the injustices in the world, in the society, in the institutions, have almost all paid at least some attention to this guy. Some disagreed and turned away, some agreed wholeheartedly and became loyal followers and some agreed partly and adapted his ideas and techniques according to their own taste and their own views about things. One from the last kind is also someone with whom I have happened to be concerned recently. That one was Fassbinder, a prolific filmmaker from the same part of the world as Brecht. Another filmmaker (from India) of this kind was Ritwik Ghatak. But about them, later.

Brecht’s ideas about ‘epic theatre’ (the quotes are there because it is a specific theory or a specific kind of theatre, not necessarily what you would guess from the words: it is a technical term) were a result of synthesizing and extending the ideas of Erwin Piscator and Vsevolod Meyerhold.

About the alienation effect, this excerpt from the Wikipedia article on Brecht gives a fairly good introduction:

One of Brecht’s most important principles was what he called the Verfremdungseffekt (translated as “defamiliarization effect”, “distancing effect”, or “estrangement effect”, and often mistranslated as “alienation effect”). This involved, Brecht wrote, “stripping the event of its self-evident, familiar, obvious quality and creating a sense of astonishment and curiosity about them”. To this end, Brecht employed techniques such as the actor’s direct address to the audience, harsh and bright stage lighting, the use of songs to interrupt the action, explanatory placards, and, in rehearsals, the transposition of text to the third person or past tense, and speaking the stage directions out loud.

But more than this somewhat technical aspect, what attracts me to the ‘Brechtian’ art was expressed extremely well by Erwin Piscator in 1929:

For us, man portrayed on the stage is significant as a social function. It is not his relationship to himself, nor his relationship to God, but his relationship to society which is central. Whenever he appears, his class or social stratum appears with him. His moral, spiritual or sexual conflicts are conflicts with society.

I read this only today, but as my (few) readers might have noticed (which I explicitly expressed once), almost all of what I write here is about ‘Individual and Society’ (which is also one of the most common tags that I use). For me, the above is the crux of the Brechtian enterprise. But I should add that in my opinion the Brechtian technique, along with its variants, is not the only technique for achieving the goal (for expression in art as well as for scholarly investigation) outlined in the above quotation. Still, I can’t resist saying here that it is the key to understanding Fassbinder. Many a reviewer of Fassbinder movies has made a fool of himself by ignoring this.

Having provided this little context, I will move now to the thing that precipitated this article. Yesterday, after posting one more of the translations of his poems on a blog, I came across a post that pointed me to a news story from Reuters. Since it is from Reuters, it has been carried by many other news outlets.

The story reports that a researcher from the University of Manchester “has uncovered the truth behind the death of German playwright Bertolt Brecht”. It goes on to say:

Professor Stephen Parker … said the playwright died from an undiagnosed rheumatic fever which attacked his heart and motorneural system, eventually leading to a fatal heart failure in 1956.

Previously it was thought his death in 1956 aged 58 had been caused by a heart attack.

So far, so good. But here is the precious bit:

Parker said the playwright’s symptoms such as increased heart size, erratic movements of the limbs and facial grimace and chronic sore throats followed by cardiac and motorneural problems, were consistent with a modern diagnosis of the condition.

“When he was young no one could get near the diagnosis,” Parker, 55, told Reuters. “Brecht was labeled as a nervous child with a ‘dicky’ heart, and doctors thought he was a hypochondriac.”

Brecht’s childhood condition continued to affect him as an adult, making him more susceptible to bacterial infections such as endocarditis which affected his already weakened heart, and kidney infections which plagued him until the end of his life.

Parker believed that his underlying health altered the way the playwright felt and acted.

“It affected his behavior, making him more exaggerated in his actions, and prone to over-reaction,” he said. “He carried the problem all his life and compensated for this underlying weakness by projecting a macho image to show himself as strong.”

I have quoted at this length because I didn’t want to lose anything in the paraphrase. So this researcher is a medical doctor? Wrong. He is an expert in German Literature. And he derived all these conclusions from Brecht’s medical records. The report ends with this gem:

“Going into this project I felt I didn’t really fully understand Brecht,” he said. “This knowledge about his death opens a lot of new cracks about the playwright, and gives us a new angle on the man.”

As the Americans (and now even the Indians) say, Wow!

The Superman might have been fictional, but we now have a Super Researcher. Nothing short of real superpowers could have made him achieve this amazing feat: “his underlying health altered the way the playwright felt and acted”. Felt and acted! That is a nice summing up of the whole business of existence. The key to all this was rheumatic fever! This would make a nice present to an absurdist poet looking for ideas. An expert in German Literature goes through the medical records of a man who was born in 1898 and died in 1956, having lived in various countries during one of the most tumultuous periods in history (when there were no computers: well, hardly). He (the Expert) felt “he didn’t really fully understand” Brecht and, by going through these medical records (one of the key exhibits being an X-ray), found out that all this ‘epic theatre’ and the ‘alienation effect’ and affiliation to Marxism and his poetry and his immeasurable influence on a large fraction of the best minds of the world for the last three quarters of a century was just the result of his rheumatic fever. All his politics was just a simple disease.

As if this wasn’t enough, there is something else that would have caused cries of “Conspiracy theory!” if a different party was involved in the affair. His research shows that the 1951 X-ray report, which showed an enlargement to the left side of Brecht’s heart, was never shown to the playwright or known about by his doctors and it may have been (emphasis mine) held back by the German security services, the Stasi, who had a grudge against the playwright.

So all of you loony lefties, you commie fairies, this idol of yours was just a sick man. And if he was not, well, then he was at least (indirectly) killed by a communist government. So wake up, man! Give up all this talk about the individual and the society and injustice and imperialism etc. Get back on track and let’s live up the market dream together. We can change things. Yes, we can.

To be fair to Professor Parker, he has written a ‘literary biography’ of Brecht and it might be that he is not really claiming all of the above. However, what matters in the world outside the closed academic circle of experts on German Literature, is the effect of the reports of this study on the common readers. And what appears in these reports is, to use a word from the report itself, quite a sinister subtext. The Indian media right now is full of such reports (often of a much cruder, laughably cruder, moronically cruder variety) with similar, barely concealed subtexts, with obvious relevance to the current political situation in the country.

The ‘study’ apparently says nothing about the effect that his blacklisting in Hollywood might have had on him. Did the FBI (or any of the other agencies) have a grudge against him? Here was one of the most admired and influential playwrights, who had sketched notes for numerous films, but he got to write the script of only one movie, which was directed by Fritz Lang. He was interrogated by the House Un-American Activities Committee (HUAC) and decided to leave the US after that. He lived during the period when his country went mad and so did the world, with millions upon millions dying. He saw Germany descend from relative decency into barbarism. He later also saw the degeneration of the revolution in the Eastern Bloc. Did all that have anything to do with what he was and maybe even with why he died relatively young? Parker doesn’t seem interested in such trivialities and externalities. At least Reuters doesn’t; I can’t say more about Parker himself, because I don’t have access to the complete and original ‘study’ as written by him.

Very long ago, I had read one of the novels by that great favourite of those looking for gentlemanly humour, P. G. Wodehouse. In that novel (whose name I don’t remember), one of the main characters (Jeeves, perhaps) decides to go, for some reason, on a kind of fast. And from the time of the very next meal, his whole personality starts changing. He becomes dissatisfied with a lot of things. He starts finding faults in everything. His good nature is all gone. In short, he becomes the caricature of a dissenter.

Finally, when things go beyond a point, the plot has him give up the fast, maybe with some persuasion from others. As soon as he has had a good meal again, he reverts to his usual self. The dissenter is gone. Then comes an editorial comment from the narrator which goes something like this: If only Gandhi (no ‘Red Top’, as you probably know) were to give up his fasting antics, he wouldn’t be creating so many unnecessary problems. As far as Wodehouse is concerned, he has won the argument against the whole idea of Indian independence and whatever else Gandhi said he was fighting for.

But we shouldn’t be too hard on poor Wodehouse, as cautioned by Orwell in his defense, because, for one thing, the humourist was just too innocent of political awareness.

A scholar of Brecht and one of the biggest news agencies in the world, however, belong to a different category.

But this is not such a unique event. Parker has just given a new meaning to the idea of pathologizing troublesome people. To the idea of ‘finding dirt’ on people who don’t follow the rules of the game. It is just a sophisticated version of the understated witch hunt against Julian Assange. A small attempt at rewriting History in a somewhat Orwellian sense. The motivation is all there, as more and more people start talking about the ‘churning’ and ‘renewed stirrings’ for a fairer world. Yet another facet of the psychological operations (psyops) in these times of the gold rush.

(Using Bob Dylan’s words, we could say that Professor Parker is perhaps just a pawn in their game, but of a different kind than Wodehouse was for the Nazis.)

 

One of the significant influences on Brecht was Chaplin’s movie The Gold Rush.

Life is full of poetry and drama.

And melodrama.

June 4, 2010

Shooting Oneself in the Foot

A few years ago I had received some feedback from someone about a research paper that I was going to submit to a major conference. Paraphrasing the feedback (repeating the exact words, even with the reference, will be copying: won’t it?), I was told that there was something that I had put in the paper, which, if I insisted on retaining, might make the reviewer look at my paper in a negative light. So, if I didn’t remove that part, I would be shooting myself in the foot.

This is beside the point, but I thought what I had added was correct and so I retained it. The paper was rejected, but I would like to believe that the reason for rejection was not that I had shot myself in the foot.

Getting back to the point, this is an expression that I have come across innumerable times, mostly directed at others, but sometimes directed at me. As a person who claims to be a writer, translator as well as a researcher in a language related discipline (among other things), I can’t help obsessing about how such expressions are used and what they mean, what they show and what they hide.

But I am not interested in writing an academic paper about that. So I write something here. And you are not supposed to review this piece when I submit the next Computational Linguistics paper which might come to you for review. (See the comment functionality below?).

Recently, Chomsky used this expression in a speech, saying ‘those who are being harmed are shooting themselves in the foot’. Now, most of the time that I have come across this expression, I have thought it was being used cynically to show something which wasn’t there and to hide something that was there. Or for some other questionable purposes. However, the people using this expression were mostly respectable, well-meaning people. Most probably they hadn’t thought about this expression in the way that I had done. Maybe because if they were to do it, they would be shooting themselves in the foot.

But when Chomsky uses this expression, I can’t but believe that he is using it to mean something sensible, not cynical (if this last part looks strange to you, look up the meanings and histories of these two words, especially the second one).

I do believe that what Chomsky said was basically correct. That is, there are some people who are being harmed and they are indeed shooting themselves in the foot (I am not sure whether I am one of them or not).

The reason I am writing this is that I also believe (based on evidence, not on faith) that such people are (relatively) so few that ridiculing them or offering them advice is hardly going to matter. I must add here that Chomsky did actually caution against ridiculing such people (who have realized that they are being systematically harmed). He only expressed his disappointment that instead of doing something to stop this systematic harming, they are shooting themselves in the foot.

You see, there are also people who are being harmed and are shooting themselves in the head (or ‘consuming pesticide’). You might say that they belong to the same category because the expression is metaphorically wide enough to cover them. That might be true. But then there are also a far larger number of people who are being harmed and they are doing something very different.

They are not shooting themselves in the foot (or in the head). They are shooting others (who are also being harmed) in the foot*. Often they are also shooting others (who are also being harmed) in the head. Sometimes they are doing it for a few extra peanuts, sometimes just for the fun of it and sometimes because they have been led to believe that these targets are their enemies (or the enemies of the nation, or the enemies of the society, or of the religion, or of the community etc.). And since doing it openly is a bit problematic (not cool anymore, baby!), they often have to make it appear as if their target shot himself in the foot (or in the head), whether deliberately or accidentally.

* Perhaps they are programmed in Concurrent Euclid.

So, my take on the matter is that we should be talking about people who are being harmed and who are (literally or metaphorically) shooting others who are also being harmed, whether in the foot or in the head. Because without them, the whole shooting machinery probably won’t be able to operate. In fact, to visualize a grisly scenario, if all such people stopped shooting others (who are being harmed) and started only to shoot themselves in the foot, even then the shooting machinery would probably become dysfunctional. Fortunately, most people will not be interested in shooting themselves in the foot (or in the head) if they are just able to find any feasible alternative. Unfortunately, no one from above can tell a person what such an alternative means in practical terms in that person’s circumstances and it’s very hard to find it out for oneself. It’s very hard to even be sure that such an alternative exists. If it does, it’s very hard to translate it into any meaningful action. Compared to a few decades earlier, it is infinitely harder now, given the extraordinary consolidation of the global power structure (going far beyond what Foucault had studied up to his time), to a great extent due to the techno-administrative ‘advances’ (mostly in the name of security).

There are, surely, people who are being harmed but are not shooting others (being harmed or not being harmed). I won’t say anything about them right now.

(To academic busybodies and surface-style junkies: don’t bother to count the number of times the said expression has been used in this short piece: it has been done very deliberately. Perhaps the author was trying to shoot …).

 

 

For having read the above, here is a bonus link: Fascism then. Fascism now?

May 22, 2009

How Many Grams?

There is an automatically (intelligently) generated blog which I have read recently.

It appears to be (let’s give ‘seems’ some rest) quite a popular one in a certain section.

I know the corpus on which it was trained.

And the corpus on which it was retrained.

(Including most of the quotes and the comments, especially the long ones).

But I wonder whether the order of n-grams was five or six.

It is definitely better than four grams.

It could even be Se7en.

This brings up a new idea.

What about writing a paper on automatically guessing the order of n-grams, given some generated text?

It may be difficult in the general case, but in our case we know the corpus on which it was trained.

Any takers?
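
For any takers, here is one naive starting point, a minimal sketch rather than a method. If the generator is a plain order-n model trained on a known corpus, every n-gram in its output should occur somewhere in that corpus, while longer n-grams will often be unseen. So one can look for the largest n at which (almost) all of the generated text’s n-grams are covered by the corpus. The function names (`ngrams`, `coverage`, `guess_order`), the whitespace tokenization, the toy data and the 0.99 threshold are all assumptions made for illustration; smoothing, back-off or retraining on a second corpus would blur the picture.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def coverage(corpus: List[str], generated: List[str], n: int) -> float:
    """Fraction of n-grams in the generated text that also occur in the corpus."""
    seen: Set[Tuple[str, ...]] = set(ngrams(corpus, n))
    grams = ngrams(generated, n)
    if not grams:
        return 0.0
    return sum(g in seen for g in grams) / len(grams)


def guess_order(corpus: List[str], generated: List[str],
                max_n: int = 8, threshold: float = 0.99) -> int:
    """Largest n (up to max_n) whose coverage stays at or above the threshold."""
    best = 0
    for n in range(1, max_n + 1):
        if coverage(corpus, generated, n) >= threshold:
            best = n
        else:
            break  # coverage has dropped: longer n-grams are no longer all seen
    return best


# Toy illustration (hypothetical data): every bigram of the generated text
# occurs in the corpus, but the trigram ("b", "c", "e") does not.
corpus = "a b c a d c e".split()
generated = "a b c e".split()
print(guess_order(corpus, generated))  # → 2
```

On short texts this will tend to overestimate the order, since a short output may simply follow the corpus for a long stretch; the guess is only as good as the amount of generated text available.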

April 16, 2009

Accepted, but not Published

Academicians or researchers list their publications prominently on their home pages. After all, these are supposed to represent the best of their work. They also quite often (especially those who have a large number of publications) categorize them according to some criteria like the venue (workshop, conference, journal or book: in the reverse order of prominence) or peer review (unrefereed and refereed).

In this post we propose that there should be a new category of publications. This category is needed because a lot of researchers (for good or for bad) now come from underprivileged countries. For most of these researchers, traveling abroad to attend a conference, even if their paper has been accepted, is something very hard to do. In some sense it is even harder than getting a paper accepted, which is itself relatively harder for them, given the lack of certain privileges (whether you like the word or not) such as generous research grants, infrastructure and language resources, combined with the prejudice (it is there: I am not inventing it, whoever might be blamed for it). To these problems can be added the problem of compulsory attendance at a conference or a workshop. It is partly these conditions which have prompted suggestions from certain quarters that researchers from these countries should concentrate on journal papers (never mind the delay and difficulties involved or the unfairness of the proposition, even though it has some practical justification).

But you can never be sure while submitting that you certainly won’t be able to attend. Also, hope is said to be a good thing. Therefore, the event of a researcher submitting a paper and hoping to attend but not being able to attend cannot be ruled out.

This brings us to the proposal mentioned earlier. One solution to this problem is that there should be another category of papers: accepted but not published, because the author couldn’t afford to attend the conference or the workshop. (By the way, workshops are the most happening places nowadays: more on that later).

The author of this post must know because he has authored more than one such publication.

Of course, the condition will be that if and when such a paper is resubmitted (with or without modifications, but without any substantial new work), accepted again and finally published, the entry marked as ‘accepted’ should be removed and replaced by an entry marked as ‘published’.

After all, if we are serious about research, then the work (which has been peer reviewed and accepted) should be given somewhat more importance than some pages printed in some proceedings (or attendance in a conference for that matter).

This, of course, doesn’t mean that you can get basically the same thing published (or accepted) in more than one place.

(Sorry for the Gory Details)

P.S.: Maybe there is no need for the above apology as the depiction of the Gory Details of the Indian Reality is now getting multiple Oscars (The Academy Awards: the keyword is Academy). But maybe there is because some researchers have a more (metaphorically) delicate constitution which can be hurt by the Gory Details.

Queen’s P.S.: Off with his head!

February 22, 2009

Splitting Hairs

The super-specialist, engaged
in advancing knowledge and science,
has the job of splitting hairs,
of peeling the skin off a hair
This can have many benefits
but only as long as,
once the skin has been peeled off,
while immersed in the study
of a cell inside the hair,
it is not forgotten
that this very hair holds
many such cells
that over these cells
there was also a skin
which was peeled off
and only together with which
a whole hair is made
that the roots of hundreds of thousands of such hairs
sit on one head
and this head,
together with several other organs,
makes up a body
and billions of such bodies exist
Not only that, there are other bodies of many kinds,
each of which
can be found
(apart from the vanishing species)
in large numbers

All these bodies
live on a big (or small) sphere
on which there is much else besides bodies
and countless such spheres
keep circling here and there
Many of these
may have bodies on them
which may have heads
the heads may have hair
and in the hairs (once the skin is peeled off)
cells may also be found
which are perhaps the same
as the ones being studied
or perhaps not

Not making the mistake
of forgetting everything else
while immersed in the study
of a cell inside the hair
is fine
but forgetting this, too,
is not free of danger:
that in the universe
of countless spheres
being talked about
only some have
bodies on them
which may have heads,
or may not
and the hair on the heads (if there is any)
may have minute cells inside
whose study
may yield conclusions
that can prove wrong
the verdicts and fatwas
being handed down
about the universe (or some part of it)

[1997 or 1998]

October 28, 2008

Computational Linguistics

As I wrote in the previous entry (can ‘pravishti’, the Hindi word I have used here, serve for ‘post’?), over the next few weeks I am going to write about Sanchay.

But since Sanchay has been built especially (apart from general users) for researchers in computational linguistics or linguistics, it would be good to make clear what computational linguistics or linguistics means or, if you already know what they mean, what I mean by them. The latter because, while the general public certainly has all kinds of misconceptions about what these subjects (computational linguistics or linguistics) mean, even researchers in these subjects do not agree on their definitions.

The truth is that in the Hindi world most people still take linguistics to mean the kind of study that it was taken to mean at the beginning of the last century. But I would not like to go in that direction of the debate right now, because there is so much to say about it that the present purpose would be left behind.

There is also a great deal to say about the definition of computational linguistics or linguistics and about their boundaries, but for now a little will have to do.

So, put briefly, linguistics is the field of research or study in which one studies not just the grammar of some one language but natural or human (that is, not artificial) language, scientifically. The idea is now widely accepted that the structure of the human brain is directly related to the structure of language and, since the structure of the brain is basically the same in all humans, all natural or human languages are also the same in everything except surface features. That is why, as is famous in the modern literature of these subjects, if an American’s infant were adopted right after birth by a Chinese family and the child grew up in China, the child would learn to speak Chinese as easily as a child of any Chinese family. There are plenty of other such things, but the main point is that linguistics is the scientific study of natural or human language.

At least the attempt is for the study to remain scientific; whether it actually manages to is a matter of debate.

Coming now to computational linguistics, in this subject our attention shifts from humans to the computer, but the earlier condition still applies: the scientific study of natural or human language. The difference is that our aim now becomes to make the computer able to understand natural or human language and to use it. Obviously this is still a very distant prospect, and that should not be surprising, because in linguistics itself (even after the extraordinary achievements of the last century) scientists are stuck among a great many obstacles.

Even so, a good deal has become possible in computational linguistics and a good deal more may become possible (in the near future). But this does not include the computer speaking and understanding language like a human. What it does include are technologies that can search documents better, summarize them, translate them to some extent and so on.

But the trouble in the Indian context is that we have not yet even reached the point where we can easily use the computer as merely a better typewriter. There have been some achievements in this direction, but compared to English or the major European languages we are nowhere. As most of you already know, this is a long story that is best left alone for now.

But Sanchay has been developed in this very context, about which we will talk further on.

October 26, 2008

संचय का परिचय

पिछली पोस्ट (शर्म के साथ कहना पड़ रहा है कि पोस्ट के लिए कोई उपयुक्त शब्द नहीं ढूंढ पा रहा हूं) में मैंने (अंग्रेज़ी में) संचय के नये संस्करण के बारे में लिखा था। मज़े की बात है कि संचय के बारे में मैंने अभी हिंदी में शायद ही कुछ लिखा हो। इस भूल को सुधारने की कोशिश में अब अगले कुछ हफ्तों में संचय के बारे में कुछ लिखने का सोचा है।

तो संचय कौन है? या संचय क्या है?

पहले सवाल का तो जवाब (अमरीकी शब्दावली में) यह है कि संचय एक सिंगल पेरेंट चाइल्ड है जिसे किसी वेलफेयर का लाभ तो नहीं मिल रहा पर जिस पर बहुत सी ज़िम्मेदारियाँ हैं।

दूसरे सवाल का जवाब यह है कि संचय सांगणिक भाषाविज्ञान (कंप्यूटेशनल लिंग्विस्टिक्स) या भाषाविज्ञान के क्षेत्र में काम कर रहे शोधकर्ताओं के लिए उपयोगी सांगणिक औजारों का एक मुक्त (मुफ्त भी कह सकते हैं) तथा ओपेन सोर्स संकलन है। पर खास तौर से यह कंप्यूटर पर भारतीय भाषाओं का उपयोग करने वाले किसी भी व्यक्ति के काम आ सकता है। इसकी एक विशेषता है कि इसमें नयी भाषाओं तथा एनकोडिंगों को आसानी से शामिल किया जा सकता है। लगभग सभी प्रमुख भारतीय भाषाएं इसमें पहले से ही शामिल हैं और संचय में उनके उपयोग के लिए ऑपरेटिंग सिस्टम पर आप निर्भर नहीं है, हालांकि अगर ऑपरेटिंग सिस्टम में ऐसी कोई भी भाषा शामिल है तो उस सुविधा का भी आप उपयोग संचय में कर सकते हैं। यही नहीं, संचय का एक ही संस्करण विंडोज़ तथा लिनक्स/यूनिक्स दोनों पर काम करता है, बशर्ते आपने जे. डी. के. (जावा डेवलपमेंट किट) इंस्टॉल कर रखा हो। यहाँ तक कि आपकी भाषा का फोंट भी ऑपरेटिंग सिस्टम में इंस्टॉल होना ज़रूरी नहीं है।

The current version of Sanchay is 0.3.0. The biggest difference from the previous version is that all of Sanchay's tools can now be used from one place; there is no need to remember the names of separate scripts. Twelve tools (applications) are included in all:

  1. Sanchay Text Editor
  2. Table Editor
  3. Find-Replace-Extract Tool
  4. Word List Builder
  5. Word List Analyzer and Visualizer
  6. Language and Encoding Identification Tool
  7. Syntactic Annotation Interface
  8. Parallel Corpus Annotation Interface
  9. N-gram Language Modeling Tool
  10. Discourse Annotation Interface
  11. File Splitter
  12. Automatic Annotation Tool

If you cannot make head or tail of most of these, please wait a little. I will try to give more information about them in later posts.

Perhaps there is no harm in adding that Sanchay is the result of this humble writer's stubborn resolve over the past few years, with contributions from some other people too, even if small ones. The names of all of them will soon be available on the Sanchay website. Almost all of them are (or were) students who worked, or are working, on some project under my ‘guidance’.

Hopefully the next version of Sanchay will be out in a few months, with even more tools and facilities.

October 5, 2008

Good News and Bad News on the CL Front

First, as the saying goes, the bad news. We had submitted a proposal for the Second Workshop on NLP for Less Privileged Languages for the ACL-affiliated conferences. That proposal has not been accepted. In all, 41 proposals were submitted and 34 of them were accepted. Ours was among the not-accepted seven (euphemisms can be consoling).

Was it that bad? I hope not.

Don’t those capital letters look silly in the name of a rejected proposal?

Now the good news. The long awaited new version of Sanchay has been released on Sourceforge. (Well, at least I was awaiting it.) This version has been named (or numbered?) 0.3.0.

The new Sanchay is a significant improvement over the last public version (0.2). It now has one main GUI from which all the applications can be controlled. There are twelve (GUI based) applications which have been included in this version. These are:

  • Sanchay Text Editor that is connected to some other NLP/CL components of Sanchay.
  • Table Editor with all the usual facilities.
  • A more intelligent Find-Replace-Extract Tool (can search over annotated data and allows you to see the matching files in the annotation interface).
  • Word List Builder.
  • Word List FST (Finite State Transducer) Visualizer that can be useful for anyone working with morphological analysis etc.
  • One of the most accurate Language and Encoding Identifier that is currently trained for 54 language-encoding pairs, including most of the major Indian languages. (Yes, I know there is a number agreement problem in the previous sentence.)
  • A user friendly Syntactic Annotation Interface that is perhaps the most heavily used part of Sanchay till now. Hopefully there will be an even more user friendly version soon.
  • A Parallel Corpus Annotation Interface, which is another heavily used component. (Don’t take that ‘heavily’ too seriously).
  • An N-gram Language Modeling Tool that allows you to compile models in terms of bytes, letters and words.
  • A Discourse Annotation Interface that is yet to be actually used.
  • A more intelligent File Splitter.
  • An Automatic Annotation tool for POS (Part Of Speech) tagging, chunking and Named Entity Recognition. The first two should work reasonably well, but the last one may not be that useful for practical purposes. This is a CRF (Conditional Random Fields) based tool and it has been trained for Hindi for these three purposes. If you have annotated data, you can use it to train your own taggers and chunkers.
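The byte, letter and word units mentioned for the N-gram tool above can be illustrated with a minimal Python sketch. This is not Sanchay's code (Sanchay is written in Java); the function names `ngrams` and `count_ngrams` are hypothetical, a "letter" is assumed here to be a single Unicode code point, and words are assumed to be separated by whitespace.

```python
from collections import Counter


def ngrams(units, n):
    """Return every contiguous run of n items from a sequence, as tuples."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def count_ngrams(text, n, unit="word", encoding="utf-8"):
    """Count n-grams of the text over bytes, letters or words.

    Assumptions of this sketch: a 'letter' is one Unicode code point
    (not a full Indic akshara), and words are whitespace-separated.
    """
    if unit == "byte":
        units = list(text.encode(encoding))   # integers in 0..255
    elif unit == "letter":
        units = list(text)                    # one code point each
    elif unit == "word":
        units = text.split()                  # whitespace tokens
    else:
        raise ValueError("unit must be 'byte', 'letter' or 'word'")
    return Counter(ngrams(units, n))


if __name__ == "__main__":
    sample = "संचय क्या है"
    for unit in ("byte", "letter", "word"):
        counts = count_ngrams(sample, 2, unit)
        print(unit, sum(counts.values()), "bigrams")
```

The same text gives very different counts at each level, which is one reason byte-level models can be handy when the encoding of a document is not known in advance.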

All these components use the customizable language-encoding support, especially useful for South Asian languages, that doesn’t need any support from the operating system or even the installation of any fonts, although these can still be used inside Sanchay if they are there.

More information is available at the Sanchay Home.

The capitals don’t look so bad for a released version.

The downside of even this good news is that my other urgent (to me) work has got delayed as I was working almost exclusively on bringing out this version for the last two weeks or so.

But then you need a reason to wake up and Sanchay is one of my reasons. And I can proudly say that a half-hearted attempt to generate funding for this project by posting it on Micropledge has generated 0$.

Sanchay is still alive as a single parent child without any welfare but with a lot of responsibilities.

Now I can have nightmares about the bugs.

June 13, 2008

Sharing Yves Montand’s Gift

I can’t resist sharing this legendary song by a legendary singer. You can watch him sing this song, which he introduced long ago but which has since been sung by innumerable singers, including his mentor Edith Piaf.

It’s called ‘Les Feuilles Mortes’ (‘Autumn Leaves’ in English) and is based on a poem by Jacques Prévert and has music by Joseph Kosma. I am sure a part of the tune has been used in an old Hindi song, but I am just not able to place that song.

This is also a gift from technology. There are people who, over the decades, have helped in the development of technology for this. And there are people who have helped make something like ‘precision’ (and/or) cluster bombs.

Perhaps the intersection between the two sets is quite large.

Did they have to? Necessarily?

By the way, here is the link for the residents of IIIT, Hyderabad who won’t be able to see the video above as the youtube site is banned there.

I mean here.

Too dangerous a technology.

But the in.youtube site (which was inaugurated with news stories in the national mainstream media) is not banned so far. I hope nationalism ensures that it remains unbanned. It should be of some use. Nationalism. Earn its keep. If it works hard enough.

Unfortunately, WordPress doesn’t recognize the in.youtube site.

But nationalism has not saved the India Together site from being banned. And the funny thing is that I am perhaps the only person on the campus who tries to access this site.

While I am at it, I may as well share a song by Edith Piaf.
