Cryptography, Information Theory, and Error-Correction. Aiden A. Bruen. Читать онлайн. MREADZ.NET

Cryptography, Information Theory, and Error-Correction. Aiden A. Bruen

Читать онлайн книгу.

В начало <18 19 20 21 22 23 24 25 26 27 >В конец

Cryptography, Information Theory, and Error-Correction - Aiden A. Bruen

Source: These percentages are based on the book “The Tragical History of Doctor Faustus”, by Christopher Marlowe, which is a book of approximately 100 000 characters that was chosen randomly from the Project Gutenberg.

Frequency analysis can be used for cryptanalysis. However, one needs a lot of craft in its use, along with any information that can be gathered about the contents of the message and the sender.

2.6 Breaking the Vigenère Cipher, Babbage–Kasiski

Now that the Vigenère cipher has been defined, we will show how to use the frequencies of letters in English, to break this cipher. We have two tasks.

Find the keyword length.

Find the keyword itself.

We have two methods to find the length of the keyword. The first method, the Babbage–Kasiski method, attempts to find repeated successive triples (or four‐tuples, or five‐tuples, etc.,) of letters in the cipher text. The second method treats the English language like a stationary or even ergodic source (see Chapter 11).

We will use two fundamental principles in carrying out our tasks.

“E” is the most frequent letter of the English language.

Informally, written English tends to “repeat itself.” This means that the frequencies of a passage starting in position 1 are similar to the frequencies of the passage starting in position when we slide the text along itself.

Once we obtain , the key‐length, we can find the keyword itself. We do this by using the first fundamental property above. Namely, we exploit the statistics of the letters in English, or pairs of letters (i.e. digrams), or trigrams, etc.

The second principle has two important interpretations. For the Babbage–Kasiski method, this means that if we find a repeated letter (or sequence of letters) in the cipher text there is a good chance that it comes from a given letter (or sequence of letters) in the plain text that has been enciphered by a given letter (or letters) in the key. Thus, there is a reasonable expectation that the distances between such repeated sequences of letters equals the key length or a multiple of the key length.

The second principle has an important implication in terms of our second method, called the method of “coincidences,” as well. The basic idea is explained in an example below.

The Babbage–Kasiski method

To demonstrate this method for finding , suppose we received the following cipher text:

EHMVL	VDWLP	WIWXW	PMMYD	PTKNF	RHWXS
LTWLP	OSKNF	WDGNF	DEWLP	SOXWP	HIWLL
EHMYD	LNGPT	EEUWE	QLLSX	TUP

Our task is to search for repetitions in the above text. For a small cipher text, the brute‐force method is not too difficult. We focus on trigrams, and highlight some as follows:

After having found these, we compute the distances between the trigrams.

EHM	‐	60
WLP	‐	25,15,40
XWP	‐	39
MYD	‐	45

We note that with the exception of 39, 5 divides all of the distances. In fact, if we proceed with frequency analysis, it turns out that we can decipher this message with a key length of 5. The codeword is “ladel,” and the plain text is “Thor and the little mouse thought that they should douse the house with a thousand liters of lithium” (Who said that secret messages have to have a clear meaning!) Frequency analysis is used in our next example below.

It is purely by chance that we had the repeated trigram WXP – this repeated trigram was not the result of the same three letters being enciphered by the same part of the keyword. This highlights the fact that the above method is probabilistic.

The method of coincidences

We will now use the second principle of “coincidences,” to find the length of the keyword. The sequence of plain text letters in positions 1 to , n plus 1 to 2 n , 2 n plus 1 , to 3 n , etc., should be approximately the same, especially if is large. It follows that a similar result holds true for the corresponding sequences of cipher text letters. Thus, if we slide the cipher text along itself and count the number of coincidences (i.e. agreements) at each displacement, then on average the maximum number of coincidences will occur after an integer multiple of the keyword length (i.e. the max occurs for some lamda n , lamda equals 1 comma 2 comma 3 comma ellipsis ). This technique can be illustrated using the following example: We will first determine the period and use it to determine the nature of the keyword from a cipher text passage:

Скачать книгу

В начало <18 19 20 21 22 23 24 25 26 27 >В конец

VVHZK	UHRGF	HGKDK	ITKDW	EFHEV	SGMPH
KIUWA	XGSQX	JQHRV	IUCCB	GACGF	SPGLH
GFHHD	MHZGF	BSPSW	SDSXR	DFHEM	OEPGI
QXKZW	LGHZI