Secret and Urgent - The Story of Codes and Ciphers. Anon
tables are the most important tools of the cryptographer. They are tables showing the relative frequencies of letters, pairs of letters, triplets (trigrams), syllables or words in normal text.
CHAPTER I
SERMONS IN STONES
I
THE Japanese language uses three systems of writing, quite unlike one another, to express the same verbal sounds. In Turkey recently Mustafa Kemal Atatürk ordered that the language be changed over from the old Arabic script, with its system of curves and dots, to the more flexible and printable Latin characters. The whole nation had to relearn its letters, or, in other words, had to learn a new system of cipher, and it is quite possible that within a few more generations the Arabic Turkish will be unreadable to all but a few antiquarians.
What is happening to the most complicated of the three Japanese forms, and to Arabic Turkish, happened long ago to many other languages; that is, the key to the written cipher was lost. The result is that the problems of archaeology are those of cryptography, and occasionally the problem proves insoluble. For a couple of centuries explorers in Asia Minor have been copying from the rocks of that ancient land certain inscriptions which were undoubtedly carved by a race known in the Bible as the Hittites and to the Egyptians as the Khita, but knowledge of the language in which they were written and the system on which it is constructed is so utterly lacking that the inscriptions have never been interpreted. The urns and tombs of that mysterious Etruscan race which preceded the Romans in central Italy have also furnished inscriptions that have thus far defied analysis.
Our knowledge of the language and hence of the civilization and history of ancient Persia might be as tenuous as that of the Hittites and Etruscans but for the greatest single task of decipherment ever performed, a job that took the entire lifetimes of a number of brilliant men. Their starting point was the Persian inscriptions, a considerable number of which had already been copied from the rocks of that land when the work began. Nobody knew the purpose of these inscriptions; scientists were so uncertain of their date that estimates varied across a sweep of twelve centuries, and of the language in which they were written it was known only that it was no longer spoken.
Carsten Niebuhr, a Danish archaeologist of the eighteenth century, was the first person to make any impression on what had been the impenetrable mystery of the Persian inscriptions. All the inscriptions were written in the cuneiform characters of Babylon, but the groupings of the little wedge-shaped units which made up these characters were very different. Long examination convinced Niebuhr that there were three main classes, which he unromantically designated as inscriptions of types I, II and III.
This classification stood up throughout the inscriptions—that is, no grouping of wedges in a type I inscription was ever found in an inscription of type II or III, and vice versa.
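The test behind this classification, that no sign of one class ever turns up in another, amounts to checking that the three sign inventories are pairwise disjoint. The following is a minimal Python sketch of that check. The sign labels and the function name `shared_signs` are invented for illustration and are not Niebuhr's actual data.

```python
from itertools import combinations

def shared_signs(inventories):
    """Return {(class_a, class_b): signs found in both} for every pair of classes."""
    return {
        (a, b): inventories[a] & inventories[b]
        for a, b in combinations(sorted(inventories), 2)
    }

# Hypothetical sign inventories; the labels are placeholders, not real cuneiform.
inventories = {
    "I": {"a1", "a2", "a3"},
    "II": {"b1", "b2"},
    "III": {"c1", "c2", "c3"},
}

overlaps = shared_signs(inventories)
# The classification "stands up" only if every pairwise overlap is empty.
classification_holds = all(not common for common in overlaps.values())
```

With the placeholder inventories above, every overlap is empty, so `classification_holds` is `True`. A single sign shared between two classes would make it `False`.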
Since the type I inscriptions were much the most numerous, Niebuhr concentrated on them in the effort to find the key of the cipher. Type I exhibited a special characteristic not found in the other two: the points of the wedges with which the characters were written were always directed to the right or downward. Niebuhr therefore suggested that the language in which these inscriptions were written should read from left to right, like modern European tongues, and not in the opposite direction, like all the Oriental languages then known. He then proceeded to compile tables of the characters and their relative frequency. There were forty-two different characters; he therefore assumed that the unknown tongue was written with an alphabet of forty-two letters, and, having spent forty years in making discoveries which have been described in three paragraphs, died.
Niebuhr’s pupil, Tychsen, took up the work where his preceptor left off, using as a basis the tables the older man had compiled. Tychsen noted that one of the forty-two letters, an isolated single wedge pointing diagonally downward, accounted for over twenty-five per cent of the total number of characters. He had compiled tables showing the frequency of letter occurrence in modern languages in the hope of getting some help from analogy, and had found that the highest frequency of any letter in any modern language was the seventeen per cent for E in French. The idea that any one letter in a forty-two-letter alphabet could constitute twenty-five per cent of the whole language struck him as irrational. However, all the characters in all the inscriptions were strung together without gaps. If this slanting wedge were a conventional sign indicating the gap between the end of one word and the beginning of the next, twenty-five per cent would be just about right. Tychsen therefore accepted the hypothesis that this was the case, and passed on to another step.
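Tychsen’s argument can be sketched in a few lines of Python: build a frequency table, then flag any symbol that is too common to be a plausible letter. This is only an illustration. The sample text, the function names and the use of `/` for the slanting wedge are assumptions, and the 17 per cent ceiling is simply the figure quoted above for E in French.

```python
from collections import Counter

def frequency_table(symbols):
    """Return {symbol: share of all symbols} for a sequence of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()}

def divider_candidates(symbols, ceiling=0.17):
    """Symbols more frequent than any plausible letter, most frequent first."""
    table = frequency_table(symbols)
    over = [(sym, share) for sym, share in table.items() if share > ceiling]
    return sorted(over, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample: an English phrase with '/' standing in for the wedge.
sample = "the/king/of/kings/the/great/king"
candidates = divider_candidates(sample)

# Once a divider is accepted, the run of characters falls apart into words.
words = sample.split("/")
```

In this toy sample only `/` exceeds the ceiling, with a share of 0.1875 (6 of 32 symbols). Splitting on it yields seven words.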
This step was based on his assumption that the type I inscriptions belonged to the age of the Parthian kingdom in Persia, contemporary with the Roman Empire. There had been several kings of Parthia named Arsaces, and one of them was known, from Roman accounts, to be particularly fond of building monuments and leaving his name around on them where people could see it. Tychsen therefore guessed that a certain word, which had the right number of characters and was very frequently repeated in the inscriptions, was the name of Arsaces. If this were true, the first character in the word would be pronounced as A and so on; and by putting an A wherever else this character occurred in the inscriptions, one would eventually find other names partially cleared, be able to fill up the gaps, and so eventually to solve the whole alphabet. He tried it on this system; it gave him nothing but gibberish and, still getting gibberish, Tychsen died, worn out and discouraged.
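The procedure Tychsen attempted, assuming a word and carrying its letters through the rest of the text, can be sketched as follows. Everything in the sketch is a toy assumption: the cipher symbols are digits, the guessed name is spelled in Latin letters, and `apply_crib` and `partial_decode` are hypothetical helper names.

```python
def apply_crib(cipher_word, guess):
    """Pair each cipher symbol with a guessed letter; None if the guess cannot fit."""
    if len(cipher_word) != len(guess):
        return None
    mapping = {}
    for sym, letter in zip(cipher_word, guess):
        # A symbol that recurs must stand for the same letter each time.
        if mapping.setdefault(sym, letter) != letter:
            return None
    return mapping

def partial_decode(words, mapping, unknown="?"):
    """Fill in the known letters, leaving a placeholder for unsolved symbols."""
    return ["".join(mapping.get(sym, unknown) for sym in word) for word in words]

# Hypothetical cipher text; the first word is the one guessed to be the name.
cipher_words = ["1231453", "4123", "35617"]
mapping = apply_crib(cipher_words[0], "ARSACES")
decoded = partial_decode(cipher_words, mapping)
```

Here `decoded` is `["ARSACES", "CARS", "SE?A?"]`: the guess clears one further word completely and another in part, which is how a correct guess spreads through a text. A wrong guess, as Tychsen found, yields only gibberish at this stage.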
His failure discouraged further inquiry for a number of years, until one of those persistent and remorselessly logical German investigators took the matter up. He was Georg Friedrich Grotefend, a professor at the University of Göttingen. Looking over Tychsen’s work, he was struck by the fact that the Dane had gone at the task in so reasonable a manner that only a fundamental error in his presumptions could account for his utter failure.
Further examination convinced Grotefend that this error lay in dating the inscriptions. It was certain that all three types of inscriptions came from the same period, for there were in existence clay tablets on which all three types were present—that is, tablets on which the wedges had been impressed while they were still soft and then baked in. The wedges in the inscriptions of types II and III pointed to the left; therefore they must represent different languages from type I. But if one assumed that type I came from the Parthian period, there was no way of accounting for the other two languages, for in Parthian times only a single language had been spoken in Persia. The Parthians, a semi-barbarous people, would certainly not have bothered to translate their important announcements into foreign tongues for the benefit of casual travelers.
In fact, the long history of Persia held only one era when three languages had been current in the country—the era of the Persian Empire, when Median, Persian and Babylonian were on an almost equal footing. Grotefend therefore assumed that this was the correct date of the ancient inscriptions, and that type I was ancient Persian, the tongue of the ruling race, since it came first wherever the three were associated.
Tychsen’s identification of the diagonal stroke as a word division struck Grotefend as sound, so he accepted it and passed to the consideration of some of the inscriptions in detail. In two of the longer texts he found a word occurring again and again, but in two different forms: a shorter form, and one which reproduced it with the addition of a couple of letters. In the pair of texts Grotefend was examining, the word appeared in both forms together, the shorter form being followed by the longer. The tables of word frequencies compiled by Tychsen showed that the shorter form appeared in all the inscriptions more frequently than any other word.
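The pattern Grotefend noticed lends itself to a mechanical search: find the most frequent word, then look for places where it is followed at once by a longer word that begins with it. The Python sketch below shows the idea on invented data. The letter groups are placeholders, not transliterations, and treating the longer form as "short form plus suffix" follows the description above rather than any knowledge of the actual signs.

```python
from collections import Counter

def most_frequent_word(words):
    """Return the word that occurs most often in the list."""
    return Counter(words).most_common(1)[0][0]

def short_then_long(words, stem):
    """Positions where `stem` is directly followed by a longer word built on it."""
    hits = []
    for i in range(len(words) - 1):
        nxt = words[i + 1]
        if words[i] == stem and nxt != stem and nxt.startswith(stem):
            hits.append((i, nxt))
    return hits

# Hypothetical text, already split on the word divider.
words = "AB/XYZ/XYZQR/CD/XYZ/EF/XYZ/XYZQR".split("/")
stem = most_frequent_word(words)
pairs = short_then_long(words, stem)
```

In this sample the most frequent word is `XYZ`, and it is followed by the extended form `XYZQR` at positions 1 and 6. A repeated pairing of that kind is what invites a guess at a fixed formula.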
Now in most departments of human thought it is not permissible to make a hypothesis and then find facts to prop it up; but in cryptography this is frequently the only method that will work. Grotefend did what every cipherer does when confronted with a cipher which gives no starting point. He guessed at the “probable word”; and the word he guessed as so frequently recurring was “king.” The shorter form would then be simply “king”; the longer, the genitive plural, “of kings”; and the doubled word, “king of kings.”
This was not entirely a shot in the dark; in good cipher practice no probable word should be pure guess. Both the language and the culture of the medieval Persian, or Sassanid Empire, were well known to scientific men. It had