HOW PUBLIC KEY ENCRYPTION WORKS (PKE demonstration software by D. Canright) SUMMARY: The method of public-key encryption used here is called the RSA algorithm (invented in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman). First, a keyset is generated: a private key, a public key, and a modulus. Then information encoded using the public key (and modulus) can only be decoded using the private key (and modulus); similarly, information encoded using the private key (and modulus) can only be decoded using the public key (and modulus). Here are the steps involved in generating the keyset: (1) find two large prime numbers: p & q [Note: a prime number, like 7 or 25771, has no factors other than itself and 1.] (2) let n = p * q ; n is the modulus (3) let c = (p-1)*(q-1) ; c (called the totient) is always even (4) choose a private key number e, less than n, such that e & c have no common factors (are relatively prime) (5) then the public key d is found by d = e^(-1) mod c (that is, d*e mod c = 1) [Note: of course, you could first choose the public key e, then the private key would be d.] The word "mod" used in step 5 above means that modulo arithmetic is used. For example, think of a clock. If we start at 9 o'clock, and add 8 hours (eight hours later), we get 5 o'clock, not 17 o'clock; the numbers "wrap around". We would write this example addition as 9 + 8 mod 12 = 5 . Multiplication (which doesn't make much sense on a clock) works similarly in modulo arithmetic: if the result is bigger than the modulus, you subtract off the modulus as many times as necessary until the result is less than the modulus. So 7 * 5 mod 12 = 11 (that is, 7*5 mod 12 = 35 mod 12 = 12*2+11 mod 12 = 11). To encode a message, converted to some number m , then the secret s = m ^ d mod n (that is, m raised to the d power, modulo n) and to decode the secret the message m = s ^ e mod n As long as n is so large that nobody can figure out how to factor it into p & q, then nobody can find out the secret key e from the public key d and modulus n. SIMPLE EXAMPLE: Here is an example using very small numbers. Say the two primes are p = 3, q = 11 then the totient is c = (3-1)*(11-1) = 20 . Say we choose the private key e = 7 then the public key would be d = 3 because e*d mod c = 7*3 mod 20 = 21 mod 20 = 1, and the modulus is n = p*q = 3*11 = 33 . (Note: with this small modulus, our message must be broken into pieces small enough to represent with a number less than 33, so we could do one letter at a time.) Suppose the secret message is the letter "P", which we could write as the number 16 (since P is the 16th letter in the alphabet), m = 16 Then the encoded secret is s = m^d mod n = 16^3 mod 33 = 4096 mod 33 = 33*124+4 mod 33 = 4 and to decode the secret again m = s^e mod n = 4^7 mod 33 = 16384 mod 33 = 33*496+16 mod 33 = 16 which gives us back the original message: 16 (or "P"). (Note: there are techniques of raising numbers to powers in modulo arithmetic that avoid calculating vary large numbers.) REAL CRYPTOGRAPHY: The size of the keys and modulus in cryptography is often measured in the number of digits, or equivalently, the number of "bits" in the binary representation (roughly 3 bits per digit, or more accurately 0.3 digits per bit). In this demo software, the modulus is (almost) always 5 digits (15 bits) and the keys are always smaller than the modulus. This is big enough to make the modulus hard to factor by pencil and paper, and keys big enough that they are hard to guess, but of course with a computer it is trivial to factor a 15-bit number. (Actually, if you know which factors to check, it is not too hard with paper and pencil; see below for more info.) So this program is NOT intended to give secure encryption. In real public-key encryption, for keys to be secure they are often 1024 bits (~307 digits) or larger. [Note: for conventional (not public-key) cryptography with a single secret key, equivalent security would require only about 80 bits (~24 digits).] See the About PKE window for information on serious cryptography programs. WHY THE RSA ALGORITHM WORKS: The central mathematical theorem behind the RSA algorithm is called Euler's Theorem: If n is a positive integer and m is an integer that is relatively prime to n, then m^c mod n = 1 , where c is the number of positive integers less than n that are relatively prime to n. Relatively prime means the two numbers have no positive factors in common (other than 1), for example 12 and 25 are relatively prime. One name for c above is the totient of n (or the Euler Phi function of n). For the RSA algorithm, n=p*q where p & q are prime numbers. Then the numbers less than n that have common factors are the multiples of p (q-1 of them) and the multiples of q (p-1 of them). So the totient of n includes all the rest: c = (n-1) - (q-1) - (p-1) = p*q - p - q + 1 = (p-1)*(q-1) . Then with the two keys e & d chosen so that e*d mod c = 1, then e*d = c*k + 1 for some number k. Given a message number m, then the secret number s is s = m^d mod n and so to decode we find s^e mod n = (m^d)^e mod n = m^(d*e) mod n = m^(c*k+1) mod n = (m^c)^k * m^1 mod n = 1^k * m mod n = 1*m mod n = m because m^c mod n = 1 by Euler's Theorem. (Actually, Euler's Theorem only applies if m is relatively prime to n, but it can be shown that this still works in that case due to other theorems.) For the proof of Euler's Theorem, see any text on Number Theory, such as _Elementary Number Theory and its Applications_ by Kenneth H. Rosen, 1993, Addison Wesley. IMPLEMENTATION DETAILS FOR THE PKE DEMO PROGRAM: In designing this program, I decided the program should really perform the RSA algorithm, but I wanted to avoid numbers bigger than the computer can represent in one "word" (16 bits, including one sign bit), which limits me to positive numbers up to 2^15 - 1 = 32767. I also wanted to be able to encode any printable ASCII character, of which there are 95 including upper & lower case, digits, and punctuation. Then 2-character blocks could be represented by numbers up to 95^2 = 9025, while 3-character blocks would require numbers up to 95^3 = 857375, too big. So I wanted every modulus n to be between 9025 and 32767. Hence I chose a limited set of primes to use in generating key sets; I only use primes from 97 up to 181 (18 primes in all), so the smallest modulus is 97*101 = 9797 and the largest modulus is 179*181 = 32399. The message is broken into 2-character blocks, and each block is converted to a number m (using base-95 for the characters) less than 9025. But when m is encoded to the secret block s = m^e mod n, the result could be as big as the modulus, possibly up to 32398. And I wanted to represent the secret in printable ASCII characters too (base-95 again), which means I need 3-character blocks for the encoded secret to represent the 2-character blocks of the original message. (Actually I shift the leading character of each block by 1 to avoid excessive blanks [zeros] in the secret.) If the original message has an odd number of characters, a blank is added at the end to complete the last 2-character block. In generating a new keyset, given the private key e (which must be an odd number to avoid common factors with the totient c), I first find all the primes p in my list where p-1 is relatively prime to e, then choose two of those at random to get the modulus (which must be bigger than e). So making a new keyset using the same private key as another will probably result in a different public key and modulus. (Warning: some odd numbers will not work as private keys, using my limited set of primes.)