Section D.2 Chapter 2 Quotes:

Al-Kindi on Frequency Analysis:

"Algorithms of Cryptanalysis

So we say, the enciphered letters are either in numerical proportions, that is poetry -because poetic meter, ipso facto, sets measures to the number of letters in each line-, or they are not. Non-poetry can be cryptanalyzed using either quantitative or qualitative expedients.

The quantitative expedients include determining the most frequently occurring letters in the language in which cryptograms are to be cryptanalyzed. If vowels functioned as the material from which any language is made, and non-vowels functioned as the shape of any language, and since many shapes can be made from the same material, then the number of vowels in any language would be greater than non-vowels. For instance, gold is the material of many shapes of finery and vessels; it may cover crowns, bangles, cups, etc. The gold in these realizations is more than the shapes made of it. Similarly, the vowels which are the material of any kind of text are more than the non-vowels in any language. I mean by vowels the letters: (a), (y or i or e) and (o or u). Therefore the vowels in any language, inevitably, exceed in number the non-vowels in a text of that language. It happens that in certain languages some vowels are greater in number than some other vowels, while non-vowels may be frequent or scarce according to their usage in each language, such as the letter (s), of which frequency of occurrence is high in Latin.

Among the expedients we use in cryptanalyzing a cryptogram if the language is already known, is to acquire a fairly long plaintext in that language, and count the number of each of its letters. We mark the most frequent letter "first", the second most frequent "second", and the following one "third", and so forth until we have covered all its letters. Then we go back to the message we want to cryptanalyze, and classify the different symbols, searching for the most frequent symbol of the cryptogram and we regard it as being the same letter we have marked "first" -in the plaintext-; then we go to the second frequent letter and consider it as being the same letter we have termed "second", and the following one "third", and so on until we exhaust all the symbols used in this cryptogram sought for cryptanalysis.

It could happen sometimes that short cryptograms are encountered, too short to contain all the symbols of the alphabet, and where the order of letter frequency cannot be applied. Indeed the order of letter frequency can normally be applied in long texts, where the scarcity of letters in one part of the text is compensated for by their abundance in another part.

Consequently, if the cryptogram was short, then the correlation between the order of letter frequency in it and in that of the language would no longer be reliable, and thereupon you should use another, qualitative expedient in cryptanalyzing the letters. It is to detect in the language in which cryptograms are enciphered the associable letters and the dissociable ones. When you discern two of them using the letter order of frequency, you see whether they are associable in that language. If so, you seek each of them elsewhere in the cryptogram, comparing it with the preceding and following dissociable letters by educing from the order of frequency of letters, so as to see whether they are combinable or non-combinable. If you find that all these letters are combinable with that letter, you look for letters combinable with the second letter. If found really combinable, so they are the expected letters suggested by the combination and non-combination of letters, and also by their order of frequency. Those expected letters are correlated with words that make sense. The same procedure is repeated elsewhere in the ciphertext until the whole message is cryptanalyzed." [12, vol. 1, pp. 121-123]
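
Al-Kindi's quantitative expedient, and the adjacency counts that his qualitative expedient relies on, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (rank_symbols, frequency_guess, adjacent_pairs) are hypothetical, the cryptogram is assumed to be a simple substitution written with ordinary letters, and ties in frequency are broken alphabetically, which the quotation does not specify.

```python
from collections import Counter

def rank_symbols(text):
    """Return the distinct letters of text, most frequent first.

    Ties are broken alphabetically (an assumption made for this sketch).
    """
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symbol for symbol, _ in ordered]

def frequency_guess(reference_plaintext, cryptogram):
    """Pair the n-th most frequent cipher symbol with the n-th most
    frequent letter of a long reference plaintext in the same language."""
    reference_rank = rank_symbols(reference_plaintext)
    cipher_rank = rank_symbols(cryptogram)
    return dict(zip(cipher_rank, reference_rank))

def adjacent_pairs(text):
    """Count letter pairs that stand next to each other inside words,
    a rough stand-in for 'associable' letters."""
    pairs = Counter()
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        pairs.update(zip(letters, letters[1:]))
    return pairs

# Toy usage: the reference text and the cryptogram are made up.
guess = frequency_guess("aaabbc", "xxxyyz")
print(guess)  # {'x': 'a', 'y': 'b', 'z': 'c'}
```

As the quotation warns, the rank-for-rank guess is reliable only for long cryptograms. For short ones, the pair counts from a reference text can be compared with the pairs observed in the cryptogram to confirm or reject tentative identifications.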

Falconer's Analysis, Steps 1-4:

First, Distinguish the Vowels from the Consonants
  1. And first, the vowels generally discover themselves by their frequency; for because they are but few in number, and no word made up without some of them, they must frequently be used in any writing.
  2. Where you find any character or letter standing by it self, it must be a vowel.
  3. If you find any character doubled in the beginning of a word, in any language it is a vowel, as Aaron, Eel, Jilt, Oogala, Vulture, etc., except for some English proper names, as Llandaff or Lloyd.
  4. In monosyllables of two letters you may distinguish it from the consonant joined with it by its frequency.
  5. In a word of three letters beginning and ending in the same letter the vowel is probably included.
  6. When you find a character doubled in the middle of a word of four letters, 'tis probably the vowel e or o.
  7. In Polysyllables, where a character is double in the middle of the word, it is for the most part a consonant; and if so, the precedent letter is always a vowel, and very often the following.[4, pp. 8-9]
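
Rules 2 and 3 above are mechanical enough to sketch in Python. The function name vowel_candidates is hypothetical, and the sketch assumes the cryptogram keeps its word divisions (spaces) and uses letters as cipher symbols. It flags candidates only and does not handle the proper-name exceptions Falconer mentions.

```python
def vowel_candidates(cryptogram):
    """Collect symbols that rules 2 and 3 suggest are vowels."""
    candidates = set()
    for raw in cryptogram.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        if len(word) == 1:
            # Rule 2: a character standing by itself must be a vowel.
            candidates.add(word)
        elif len(word) >= 2 and word[0] == word[1]:
            # Rule 3: a character doubled at the start of a word.
            candidates.add(word[0])
    return candidates

# Toy usage on a made-up cryptogram.
print(sorted(vowel_candidates("q xxz qd")))  # ['q', 'x']
```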
Secondly, Distinguish the Vowels from Themselves.
  1. Compare their frequency, and e, as we observed before, is generally the most used in the English tongue, next o, then a and i; but u and y are not so frequently used as some of the consonants.
  2. It is remarkable that amongst the vowels, e and o are often doubled, the rest seldom or never.
  3. e is very often a terminal letter, and y terminates words, but they are distinguishable, because there is no proportion to their frequency: o is not often in the end of words, except in monosyllables.
  4. e is the only vowel that can be doubled in the end of an English word, except o in too, etc.
  5. You may consider which of the vowels, in any language, can stand alone, as a, i, and sometimes o in English, a, e, o, in Latin or i the imperative of eo.[4, pp. 9-10]
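
Rules 2 and 4 above both turn on doubled letters, which are easy to tabulate. The following Python sketch uses a hypothetical helper, doubled_letters, and again assumes that word divisions are preserved. It only reports which symbols occur doubled; deciding that such a symbol stands for e or o is left to the analyst.

```python
def doubled_letters(cryptogram):
    """Return (symbols doubled anywhere, symbols doubled at the end of a word)."""
    anywhere, at_end = set(), set()
    for raw in cryptogram.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        for first, second in zip(word, word[1:]):
            if first == second:
                anywhere.add(first)
        if len(word) >= 2 and word[-1] == word[-2]:
            at_end.add(word[-1])
    return anywhere, at_end

# Toy usage on plain English, to show what the counts look like.
anywhere, at_end = doubled_letters("see too book")
print(sorted(anywhere), sorted(at_end))  # ['e', 'o'] ['e', 'o']
```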
Distinguish the Consonants Amongst Themselves.
  1. As before observe their frequency. Those of most use in English are d, h, n, r, s, t, and next to those may be reckoned c, f, g, l, m, w, in third rank may be placed b, k, p, and lastly q, x, z...
  2. You may consider which consonants may be doubled in the middle or end of words.
  3. What are terminal letters, etc.
  4. The number and nature of consonants and vowels that fall together, or do usually fall together.[4, p. 10]
Additional Observations
  1. A word of three letters, beginning and ending with the same, may be supposed did.
  2. A word consisting of four characters, with the same letter in the beginning and end, is probably that or hath.
  3. A word consisting of five letters, when the second and last are the same, is commonly which, though it may be otherways, as in known, serve, etc. And you may judge of the truth of such suppositions by the frequency of the letters in the word supposed.

Next you may compare words one with another, as on and no, each being the other reversed; so of and for, the last being the first reversed with the addition of a letter; for and from will discover each other, etc.

You may also likewise observe some of the usual prepositions and terminations of words, such as com, con, ing, ed, etc. Note that t and h are often joined in the beginning and end of English words, and sometimes in the middle.[4, pp. 11-12]
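
The word-shape guesses in the additional observations (did, that, hath, which) amount to comparing patterns of repeated letters. A minimal Python sketch follows. The names word_pattern, pattern_guesses and GUESSES are hypothetical, and the candidate list contains only the words Falconer names, not a full dictionary.

```python
def word_pattern(word):
    """Encode which positions share a letter, e.g. 'that' -> (0, 1, 2, 0)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)

# Candidate words taken from the observations above.
GUESSES = ["did", "that", "hath", "which"]

def pattern_guesses(cipher_word, candidates=GUESSES):
    """Return the candidate words whose repetition pattern matches."""
    target = word_pattern(cipher_word.lower())
    return [word for word in candidates if word_pattern(word) == target]

# Toy usage on made-up cipher words.
print(pattern_guesses("qrq"))   # ['did']
print(pattern_guesses("xyzx"))  # ['that', 'hath']
```

As Falconer notes, a pattern match is only a supposition: known and serve share the pattern of which, so the frequencies of the letters involved should be checked before accepting a guess.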