Languages and Entropy

I subscribe to the CERN RSS updates, which is where I got today’s post from. It is quite interesting, and it fascinates me how the “Laws of Large Numbers” can be put to use in the modern world …

Archaeologists who find symbols inscribed on ancient objects often face the difficult task of deciding whether the “writing” corresponds to a language, that is, whether it is a proper piece of text or simply a group of symbols. This was the case recently with a script from the Indus civilization, which existed in what is now eastern Pakistan and north-western India between about 2600 and 1900 BC.

The script has not yet been deciphered, but Rajesh P N Rao of the University of Washington and colleagues found that comparing its conditional entropy with that of other symbol sequences yielded interesting results. Conditional entropy measures how uncertain the next symbol is once the current one is known. The idea is that in a proper language symbols neither follow each other randomly nor always appear in the same order. The comparison places the Indus script much closer to natural linguistic systems such as English, Rig-Vedic Sanskrit and Old Tamil, and even to the programming language Fortran, than to non-linguistic systems such as human DNA or bacterial protein sequences.
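
To make the idea concrete, here is a minimal sketch of how the conditional entropy of a symbol sequence could be estimated from adjacent pairs. It is only an illustration, not the method used in the study: the function name `conditional_entropy` and the toy strings are my own assumptions, and the estimate is a plain count of symbol pairs with no correction for small samples.

```python
import math
from collections import Counter


def conditional_entropy(seq):
    """Estimate H(next symbol | current symbol) in bits from adjacent pairs.

    Hypothetical helper for illustration: a simple frequency-count estimate,
    not the estimator used in the Indus script study.
    """
    pairs = list(zip(seq, seq[1:]))          # every (current, next) pair
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)             # how often each pair occurs
    first_counts = Counter(a for a, _ in pairs)  # how often each symbol leads a pair
    total = len(pairs)

    h = 0.0
    for (a, _b), n in pair_counts.items():
        p_pair = n / total                   # P(current = a, next = b)
        p_next_given_cur = n / first_counts[a]  # P(next = b | current = a)
        h -= p_pair * math.log2(p_next_given_cur)
    return h


rigid = conditional_entropy("ABABABAB")  # next symbol is always determined
mixed = conditional_entropy("AABBA")     # each symbol is followed by A or B equally often
print(rigid, mixed)
```

A rigidly ordered sequence gives a value of zero, a sequence whose symbols follow each other at random gives a value near the maximum for its alphabet, and a language would be expected to land somewhere in between.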

So the script from the Indus civilization does appear to encode a language, and the fact that the conditional-entropy study places it close to Old Tamil might provide a clue to help decipher it one day.


