MATH 470 Lecture 5
« previous | Tuesday, January 29, 2013 | next »
What is the strongest code? One-time pad
- bit + random = random
- key would be as big as message, so we often do the best we can with small keys (e.g. vigenere)
Breaking Vigenere
DQZXI . . . ZOVXMJLFYFEHL (266071 chars)
Suppose you know vigenere was used and the language was English. How do you find the key length?
- Count collisions of cipher text with shifts
DQZXI . . . A . . . Q . . .EHL DQZXI . A . . . Q . . . . . EHL * *
The shift that gives the most collisions will give the most likely key length.
You want to do () operations of work; no more than that.
How long of a key gets you key space approximately the length of the message?
- Let's try 10 × 3.8 ≈ 40 shifts of incrementing amounts.
In Matlab:
freq = zeros(1,40) for i = 1:40 count = 0; slen = 266071-i-1; for j = 1:slen if cnums(j+i) == cnums(j) count = count + 1 end end freq(i) = count/slen end [y,i] = max(freq)
If you plot the frequency of collisions, the peaks occur at regular intervals.
For the case of our example, 13, 26, 39, etc., so we'll try with a key length of 13
Assume key = (k1, . . ., k13)
Do frequency analysis of the most frequent character along positions:
- (1, 14, 27, ...) is Z, so it may be or +21
- (2, 15, 28, ...) is B, so it may be or +23
- ...
Key should be (21, 23, 3, ..., 25, 24)
Amount of work: reading ciphertext ~41 times (about 2 sec on computer)
Assuming the key is correct, we decrypt the text by subtracting.
ITWASABOUTELEVENOCLOCK . . . INEVERSAWHERAGAIN
The Big Sleep by Raymond Chandler (1946)
Comparison with Brute Force
Even if you knew that the key length was 13, brute force implies trying 2613 keys
Computational assumption: English text behaves like strings from { A, B, ..., Z } leaving a particular probability distribution.
This works because
- the probability that a character at position is the same as character in position
- approximately the probability that a character in a random stream is the same as the character steps ahead
- going steps ahead,
- if is a multiple of the key length, they should have the same distribution
- if is not a multiple of the key length, they do not have the same distribution
- Evaluating the probability estimate gives
Pr[char = A && steps ahead = A] + ... + Pr[char = Z && steps ahead = Z]
Pr[char = A] * Pr[char = σ(A) ] + ... + Pr[char = Z] * Pr[char = σ(Z)]
Where σ is a permutation function
By the rarrangement inequality, this is maximized when σ is the identity function
RSA
Rivest, Shamir, Adleman in 1977 (previously discovered by Clifford Cocks at GCHQ in 1970)
(See CSCE 222 Lecture 34→)
Our first example of asymmetric encryption: decryption key is not an easy function of the encryption key.
Imagine Bob wants to communicate with Alice and bob will set everything up:
- Bob picks secret large primes and with approximately equal number of digits and computes
Bob also picks a large , relatively prime to , and computes - Bob broadcasts and , but keeps , , and secret
- Alice encrypts via
- Bob decrypts via
RSA is used in
- routers
- financial transactions
- digital signatures
Notes
- Typically, and have hundreds of digits!
- Messages can naturally be placed in by blocks:
- HELLOHELP... becomes a sequence of digits mod 26, which is just a big integer
Quiz next Tuesday